AMPLIFIED AND OVEREXPRESSED GENE IN COLORECTAL CANCERS

CROSS-REFERENCES TO RELATED APPLICATIONS 
[0001] This application is a continuation-in-part of and claims the benefit of priority of U.S.
Patent Application No. 10/346,367, filed on January 15, 2003, which is herein incorporated
by reference for all purposes.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 
[0002] This invention was made with Government support by Grant No. CA32737,
awarded by the National Institutes of Health. The Government has certain rights in this
invention.

FIELD OF THE INVENTION 

[0003] This invention relates to methods to diagnose colon cancer and other proliferative
diseases.

BACKGROUND OF THE INVENTION 

[0004] Chromosome abnormalities are often associated with genetic disorders,
degenerative diseases, and cancer. The deletion or multiplication of copies of whole
chromosomes and the deletion or amplification of chromosomal segments or specific
regions are common occurrences in cancer (Smith (1991) Breast Cancer Res. Treat. 18
Suppl. 1:5-14; van de Vijver (1991) Biochim. Biophys. Acta 1072:33-50). In fact,
amplifications and deletions of DNA sequences can be the cause of a cancer. For example,
amplification of proto-oncogenes and deletion of tumor-suppressor genes are frequently
characteristic of tumorigenesis (Dutrillaux (1990) Cancer Genet. Cytogenet. 49:203-217).
Clearly, the identification and cloning of specific genomic regions associated with cancer
is crucial both to the study of tumorigenesis and to the development of better means of
diagnosis and prognosis.

[0005] One of the amplified regions found in studies of breast and colon cancer cells is on
chromosome 20, specifically 20q13.2 (see, e.g., WO98/02539). Amplification of 20q13.2
was subsequently found to occur in a variety of tumor types and to be associated with
aggressive tumor behavior. Increased 20q13.2 copy number has been found in 40% of breast
cancer cell lines and 18% of primary breast tumors (Kallioniemi (1994) Proc. Natl. Acad.
Sci. USA 91:2156-2160). Copy number gains at 20q13.2 have also been reported in greater
than 25% of cancers of the ovary (Iwabuchi (1995) Cancer Res. 55:6172-6180), colon
(Schlegel (1995) Cancer Res. 55:6002-6005 and WO02/06526), head-and-neck (Bockmuhl
(1996) Laryngor. 75:408-414), brain (Mohapatra (1995) Genes Chromosomes Cancer 13:
86-93), and pancreas (Solinas-Toldo (1996) Genes Chromosomes Cancer 20:399-407).

[0006] A number of studies have elucidated genetic alterations that occur during the
development of colorectal tumors. For instance, deletions of p53 genes on chromosome 17p
are often late events associated with the transition from the benign (adenoma) to the
malignant (carcinoma) state. See Vogelstein et al., New England Journal of Medicine,
319:525 (1988), Fearon and Vogelstein, Cell, 61:759-767 (1990) and Baker et al., Cancer
Res. 50:1111-22 (1990). More recently, comparative genomic hybridization has shown that
specific patterns of chromosomal gains and losses take place during colorectal carcinogenesis
(see, e.g., Schlegel, et al. Cancer Research 55, 6002-6005 (1995); Ried, et al. Genes,
Chromosomes & Cancer 15, 234-245 (1996); and Nakao et al., Jpn. J. Surg. 28, 567-569
(1998)). These changes included overrepresentation (amplification) of a large portion of
chromosome 20 material.

[0007] The identification of new genes that are responsible for carcinogenesis is obviously
of great use in the diagnosis, prognosis and treatment of these diseases. The present invention
addresses these and other needs.

BRIEF SUMMARY OF THE INVENTION 

[0008] This invention provides a method for determining the presence or absence of a
colorectal cancer cell in a patient, by determining the level of a target nucleic acid that
encodes the 26#77 protein (e.g., SEQ ID NO: 2) in a biological sample from the patient. In
one embodiment, the target nucleic acid comprises a sequence at least 80% identical to SEQ
ID NO: 1. In a further embodiment, the biological sample can include isolated nucleic acids.
In another embodiment, the nucleic acids are amplified before the level of the target nucleic
acid is determined. In an additional embodiment, the isolated nucleic acids are mRNA.

[0009] The invention also provides a method for determining the presence or absence of a
colorectal cancer cell in a patient, by determining the level of a target nucleic acid that
encodes the Copine 1 (CPNE 1) protein, the Integrin B4 binding protein (ITGB4BP), RNA
Export homolog (RAE1), bone morphogenic protein 7 (BMP7), G protein, alpha stimulating
activity polypeptide 1 (GNAS), eukaryotic translation initiation factor 2, subunit 2 beta
(EIF2S2), dynein light chain A2 (DNCL2A), proteasome subunit α-7 (PSMA7), activity
dependent neuroprotector (ADNP), C20orf129, C20orf52, C20orf20, or C20orf188 (e.g.,
SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, or 28) in a biological sample from the
patient. In one embodiment, the target nucleic acid comprises a sequence at least 80%
identical to SEQ ID NO: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, or 27. In a further
embodiment, the biological sample can include isolated nucleic acids. In another
embodiment, the nucleic acids are amplified before the level of the target nucleic acid is
determined. In an additional embodiment, the isolated nucleic acids are mRNA.

[0010] In one aspect, the biological sample is colorectal tissue and the step of determining
the level of target nucleic acid is carried out using in situ hybridization.

[0011] In another aspect, the step of determining the level of target nucleic acid is carried
out using a labeled nucleic acid probe that selectively hybridizes to SEQ ID NO: 1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, or 27 under stringent hybridization conditions. The nucleic
acid probe can be immobilized to a solid support. In a further aspect, the step of determining
the level of target nucleic acid is carried out using Northern blot analysis.

[0012] In one embodiment, the step of determining the level of the target nucleic acid is 
carried out by comparing the amount of the target nucleic acid in the biological sample to the 
amount of the target nucleic acid in a reference sample. The reference sample can be from 
normal colorectal tissue. 

[0013] In another embodiment, the levels of 26#77 encoding nucleic acid are determined
when the patient is undergoing a therapeutic regimen to treat colorectal cancer. The levels of
26#77 encoding nucleic acid can also be determined when the patient is suspected of having
colorectal cancer. Similarly, levels of CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2,
DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 encoding nucleic
acid are determined when the patient is undergoing a therapeutic regimen to treat colorectal
cancer, and can also be determined when the patient is suspected of having colorectal cancer.

[0014] In one embodiment this invention provides an isolated expression vector with a
nucleic acid sequence that encodes SEQ ID NO: 2, the 26#77 protein. In further
embodiments, the nucleic acid sequence is at least 80% identical to SEQ ID NO: 1. The
invention also provides a host cell containing a vector that expresses a nucleic acid that
encodes the 26#77 protein. Nucleic acids that encode SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18,
20, 22, 24, 26, or 28 are also provided, as well as host cells containing a vector that expresses
a nucleic acid that encodes CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein.

[0015] In one embodiment this invention provides a method for determining the presence
or absence of a colorectal cancer cell in a patient, by determining the level of a target protein,
the 26#77 protein including the sequence shown in SEQ ID NO: 2. Levels of the 26#77
protein are determined in a biological sample from the patient, thereby determining the
presence or absence of the colorectal cancer cell in the patient. In one aspect the 26#77
protein levels are determined by using an antibody specific for the 26#77 protein. The
antibody can be a polyclonal antibody or a monoclonal antibody. In a further aspect, the
antibody can be labeled and the label can be a fluorescent label.

[0016] In one embodiment this invention provides a method for determining the presence
or absence of a colorectal cancer cell in a patient, by determining the level of a target protein,
the CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 protein including the sequences shown in
SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, or 28. Levels of the CPNE 1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein are determined in a biological sample from the
patient, thereby determining the presence or absence of the colorectal cancer cell in the
patient. In one aspect the CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein levels are
determined by using an antibody specific for the CPNE 1, ITGB4BP, RAE1, BMP7, GNAS,
EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein.
The antibody can be a polyclonal antibody or a monoclonal antibody. In a further aspect, the
antibody can be labeled and the label can be a fluorescent label.

[0017] In a further embodiment, the step of determination of the level of the 26#77 protein 
is carried out by comparing the amount of the 26#77 protein in the biological sample to the 






WO 2004/064601 




PCT/US2004/001153 



amount of the 26#77 protein in a reference sample. In one aspect, the reference sample is
from normal colorectal tissue. In another aspect, the determination of 26#77 protein level is
made when the patient is undergoing a therapeutic regimen to treat colorectal cancer. In a
further aspect, the determination of 26#77 protein level is made when the patient is suspected
of having colorectal cancer. Similarly, the step of determination of the level of the CPNE 1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein is carried out by comparing the amount of the
CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein in the biological sample to the amount of the
CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein in a reference sample. In one aspect, the
reference sample is from normal colorectal tissue. In another aspect, the determination of
CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein level is made when the patient is undergoing a
therapeutic regimen to treat colorectal cancer. In a further aspect, the determination of CPNE
1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein level is made when the patient is suspected of
having colorectal cancer.

[0018] In one embodiment, the present invention provides a method for treating a cancer
that overexpresses a 26#77 gene product by administering a therapeutically effective amount
of an inhibitor of 26#77 gene product to a patient who has a cancer that overexpresses 26#77.
The inhibitor of the 26#77 gene product can be an antibody, an antisense RNA molecule, or
an inhibitory RNA molecule. Similarly, the present invention provides a method for treating
a cancer that overexpresses a CPNE 1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 gene product by
administering a therapeutically effective amount of an inhibitor of CPNE 1, ITGB4BP,
RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20,
or C20orf188 gene product to a patient who has a cancer that overexpresses CPNE 1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188. The inhibitor of the CPNE 1, ITGB4BP, RAE1, BMP7,
GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
gene product can be an antibody, an antisense RNA molecule, or an inhibitory RNA
molecule.

DEFINITIONS

[0019] The phrase "determining the level of a target nucleic acid" refers to any method that
can be used to detect increased copy number of a genomic sequence or increased expression
level of a target gene. Methods for determining increased copy number are well known and
include nucleic acid hybridization methods as described below. Methods for determining the
level of expression of a particular gene are well known in the art. Such methods include
RT-PCR, real-time PCR, use of antibodies against the gene products, and the like. As explained
below, methods of the invention are used to detect increased copy number or overexpression
of a gene referred to here as 26#77 or to detect overexpression of the CPNE 1, ITGB4BP, RAE1,
BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or
C20orf188 genes. Typically, overexpression of a particular gene is at least about 2 times,
usually at least about 5 times the level of expression in a normal cell from the same tissue.
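
The fold-change criterion in the preceding paragraph can be illustrated with a minimal sketch. The function name, the arbitrary expression units, and the returned labels are assumptions made for illustration only; the text fixes only the approximate 2-fold and 5-fold thresholds relative to a normal cell from the same tissue.

```python
def classify_expression(sample_level: float, normal_level: float,
                        typical_fold: float = 2.0,
                        usual_fold: float = 5.0) -> str:
    """Label a measured expression level relative to normal tissue.

    Hypothetical helper: both levels are assumed to be in the same
    arbitrary units (e.g., normalized transcript abundance).
    """
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    fold = sample_level / normal_level
    if fold >= usual_fold:
        return "overexpressed (at least 5x)"
    if fold >= typical_fold:
        return "overexpressed (at least 2x)"
    return "not overexpressed"


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(classify_expression(10.0, 2.0))
    print(classify_expression(3.0, 2.0))
```

Because the text says "at least about", the thresholds are exposed as parameters rather than hard-coded.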

[0020] The terms "26#77 protein" or "26#77 polynucleotide" refer to nucleic acid and
polypeptide polymorphic variants, alleles, mutants, and interspecies homologues of SEQ ID
NO: 1 or SEQ ID NO: 2. Typically such genes or proteins have a sequence that has greater
than about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or greater
sequence identity to SEQ ID NO: 1 or SEQ ID NO: 2, preferably over a region of at least
about 25, 50, 100, 200, 500, 1000, or more residues. A polynucleotide or
polypeptide sequence is typically from a mammal including, but not limited to, primate, e.g.,
human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or other mammal. These
terms include both naturally occurring and recombinant forms.

[0021] The terms "copine 1 (CPNE 1) protein" or "copine 1 (CPNE 1) polynucleotide"
refer to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies
homologues of SEQ ID NO: 3 or SEQ ID NO: 4. Typically such genes or proteins have a
sequence that has greater than about 70% nucleotide sequence identity, usually 80%, 85%,
90% or 99% or greater sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4, preferably over
a region of at least about 25, 50, 100, 200, 500, 1000, or more residues. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0022] The terms "integrin B4 binding protein (ITGB4BP) protein" or "integrin B4 binding
protein (ITGB4BP) polynucleotide" refer to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues of SEQ ID NO: 5 or SEQ ID NO: 6.
Typically such genes or proteins have a sequence that has greater than about 70% nucleotide
sequence identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID
NO: 5 or SEQ ID NO: 6, preferably over a region of at least about 25, 50,
100, 200, 500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically
from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse,
hamster; cow, pig, horse, sheep, or other mammal. These terms include both naturally
occurring and recombinant forms.

[0023] The terms "RNA export homolog (RAE1) protein" or "RNA export homolog (RAE1)
polynucleotide" refer to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologues of SEQ ID NO: 7 or SEQ ID NO: 8. Typically such
genes or proteins have a sequence that has greater than about 70% nucleotide sequence
identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID NO: 7 or
SEQ ID NO: 8, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically from a
mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster;
cow, pig, horse, sheep, or other mammal. These terms include both naturally occurring and
recombinant forms.

[0024] The terms "bone morphogenic protein 7 (BMP7) protein" or "bone morphogenic
protein 7 (BMP7) polynucleotide" refer to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues of SEQ ID NO: 9 or SEQ ID NO: 10.
Typically such genes or proteins have a sequence that has greater than about 70% nucleotide
sequence identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID
NO: 9 or SEQ ID NO: 10, preferably over a region of at least about 25, 50,
100, 200, 500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically
from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse,
hamster; cow, pig, horse, sheep, or other mammal. These terms include both naturally
occurring and recombinant forms.

[0025] The terms "G protein, alpha stimulating activity polypeptide 1 (GNAS) protein" or
"G protein, alpha stimulating activity polypeptide 1 (GNAS) polynucleotide" refer to
nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies
homologues of SEQ ID NO: 11 or SEQ ID NO: 12. Typically such genes or proteins have a
sequence that has greater than about 70% nucleotide sequence identity, usually 80%, 85%,
90% or 99% or greater sequence identity to SEQ ID NO: 11 or SEQ ID NO: 12, preferably
over a region of at least about 25, 50, 100, 200, 500, 1000, or more residues.
A polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0026] The terms "eukaryotic translation initiation factor 2, subunit 2 beta (EIF2S2)
protein" or "eukaryotic translation initiation factor 2, subunit 2 beta (EIF2S2)
polynucleotide" refer to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologues of SEQ ID NO: 13 or SEQ ID NO: 14. Typically such
genes or proteins have a sequence that has greater than about 70% nucleotide sequence
identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID NO: 13 or
SEQ ID NO: 14, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically from a
mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster;
cow, pig, horse, sheep, or other mammal. These terms include both naturally occurring and
recombinant forms.

[0027] The terms "dynein light chain A2 (DNCL2A) protein" or "dynein light chain A2
(DNCL2A) polynucleotide" refer to nucleic acid and polypeptide polymorphic variants,
alleles, mutants, and interspecies homologues of SEQ ID NO: 15 or SEQ ID NO: 16.
Typically such genes or proteins have a sequence that has greater than about 70% nucleotide
sequence identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID
NO: 15 or SEQ ID NO: 16, preferably over a region of at least about 25, 50,
100, 200, 500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically
from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse,
hamster; cow, pig, horse, sheep, or other mammal. These terms include both naturally
occurring and recombinant forms.

[0028] The terms "proteasome subunit α-7 (PSMA7) protein" or "proteasome subunit α-7
(PSMA7) polynucleotide" refer to nucleic acid and polypeptide polymorphic variants,
alleles, mutants, and interspecies homologues of SEQ ID NO: 17 or SEQ ID NO: 18.
Typically such genes or proteins have a sequence that has greater than about 70% nucleotide
sequence identity, usually 80%, 85%, 90% or 99% or greater sequence identity to SEQ ID
NO: 17 or SEQ ID NO: 18, preferably over a region of at least about 25, 50,
100, 200, 500, 1000, or more residues. A polynucleotide or polypeptide sequence is typically
from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse,
hamster; cow, pig, horse, sheep, or other mammal. These terms include both naturally
occurring and recombinant forms.

[0029] The terms "activity dependent neuroprotector (ADNP) protein" or "activity
dependent neuroprotector (ADNP) polynucleotide" refer to nucleic acid and polypeptide
polymorphic variants, alleles, mutants, and interspecies homologues of SEQ ID NO: 19 or
SEQ ID NO: 20. Typically such genes or proteins have a sequence that has greater than
about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or greater sequence
identity to SEQ ID NO: 19 or SEQ ID NO: 20, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more residues. A polynucleotide or polypeptide
sequence is typically from a mammal including, but not limited to, primate, e.g., human;
rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or other mammal. These terms
include both naturally occurring and recombinant forms.

[0030] The terms "C20orf129 protein" or "C20orf129 polynucleotide" refer to nucleic
acid and polypeptide polymorphic variants, alleles, mutants, and interspecies homologues of
SEQ ID NO: 21 or SEQ ID NO: 22. Typically such genes or proteins have a sequence that
has greater than about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or
greater sequence identity to SEQ ID NO: 21 or SEQ ID NO: 22, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more residues. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0031] The terms "C20orf52 protein" or "C20orf52 polynucleotide" refer to nucleic acid
and polypeptide polymorphic variants, alleles, mutants, and interspecies homologues of SEQ
ID NO: 23 or SEQ ID NO: 24. Typically such genes or proteins have a sequence that has
greater than about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or
greater sequence identity to SEQ ID NO: 23 or SEQ ID NO: 24, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more residues. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0032] The terms "C20orf20 protein" or "C20orf20 polynucleotide" refer to nucleic acid
and polypeptide polymorphic variants, alleles, mutants, and interspecies homologues of SEQ
ID NO: 25 or SEQ ID NO: 26. Typically such genes or proteins have a sequence that has
greater than about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or
greater sequence identity to SEQ ID NO: 25 or SEQ ID NO: 26, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more residues. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0033] The terms "C20orf188 protein" or "C20orf188 polynucleotide" refer to nucleic
acid and polypeptide polymorphic variants, alleles, mutants, and interspecies homologues of
SEQ ID NO: 27 or SEQ ID NO: 28. Typically such genes or proteins have a sequence that
has greater than about 70% nucleotide sequence identity, usually 80%, 85%, 90% or 99% or
greater sequence identity to SEQ ID NO: 27 or SEQ ID NO: 28, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more residues. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. These terms include both naturally occurring and recombinant forms.

[0034] A "biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a 26#77, CPNE 1, ITGB4BP, RAE1, BMP7,
GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
protein, polynucleotide or transcript. Such samples include, but are not limited to, tissue
isolated from humans, or rodents, e.g., mice and rats. Biological samples may also include
sections of tissues such as biopsy and autopsy samples, frozen sections taken for histologic
purposes, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological
samples also include explants and primary and/or transformed cell cultures derived from
patient tissues. A biological sample is typically obtained from a eukaryotic organism, most
preferably a mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent,
e.g., guinea pig, rat, mouse; rabbit; or a bird; reptile; or fish. Livestock and domestic animals
are of particular interest.

[0035] "Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

[0036] The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., greater than about 60% amino acid sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or higher identity over
a specified region, when compared and aligned for maximum correspondence over a
comparison window or designated region) as measured using the BLAST or BLAST 2.0
sequence comparison algorithms with default parameters described below, or by manual
alignment and visual inspection. Such sequences are then said to be "substantially identical."
This definition also refers to, or may be applied to, the complement of a test sequence. The
definition also includes sequences that have deletions and/or additions, as well as those that
have substitutions, as well as naturally occurring, e.g., polymorphic or allelic variants, and
man-made variants. As described below, the preferred algorithms can account for gaps and
the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.
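
As an illustration of the percent identity measure defined above, the following minimal sketch computes identity over two sequences that have already been aligned. It is not the BLAST implementation: the function name, the "-" gap symbol, and the choice of denominator (all alignment columns, gaps included) are assumptions, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Hypothetical helper: inputs are two rows of an existing pairwise
    alignment, equal in length, with `gap` marking insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both rows hold a gap (they carry no information).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Invented toy alignment: 4 matching columns out of 6.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

A threshold such as "at least 80% identical" would then be a comparison against the returned value, evaluated over the region of interest.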

[0037] For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

[0038] A "comparison window", as used herein, includes reference to a segment of one of
the number of contiguous positions selected from the group consisting typically of from 20 to
600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence
may be compared to a reference sequence of the same number of contiguous positions after
the two sequences are optimally aligned. Methods of alignment of sequences for comparison
are well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482-489, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol.
Biol. 48:443-453, by the search for similarity method of Pearson and Lipman (1988) Proc.
Nat'l. Acad. Sci. USA 85:2444-2448, by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in
Molecular Biology, Lippincott).
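
To make the notion of optimal alignment concrete, the following is a minimal sketch of global alignment by dynamic programming in the style of Needleman and Wunsch. It is a simplified variant, not the GAP or BESTFIT implementation: the function name, the match/mismatch scores, and the linear gap penalty are all assumptions chosen for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Illustrative scoring only: identical residues score `match`, differing
    residues `mismatch`, and each gap position `gap` (linear penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # leading gaps in b
        score[i][0] = i * gap
    for j in range(1, cols):          # leading gaps in a
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the bottom-right cell to recover one alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "ACT"))
```

The two returned rows have equal length, so they can be compared column by column over any chosen comparison window.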

[0039] The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l Acad. Sci. USA 90:5873-
5887). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
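
The three smallest sum probability cutoffs named above can be read as nested tiers. The following minimal sketch applies them to a P(N) value that is assumed to have been obtained elsewhere (for example, from a BLAST report); the function name and the tier labels are hypothetical, and only the cutoffs 0.2, 0.01, and 0.001 come from the text.

```python
def similarity_tier(smallest_sum_probability: float) -> str:
    """Map a P(N) value onto the cutoffs 0.2, 0.01, and 0.001.

    Hypothetical helper: it only compares a supplied probability against
    the thresholds; it does not compute P(N) itself.
    """
    p = smallest_sum_probability
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie in [0, 1]")
    if p < 0.001:
        return "similar (most preferred cutoff)"
    if p < 0.01:
        return "similar (more preferred cutoff)"
    if p < 0.2:
        return "similar"
    return "not similar"


if __name__ == "__main__":
    # Invented example values.
    for p in (0.5, 0.05, 0.005, 1e-30):
        print(p, similarity_tier(p))
```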

[0040] An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross
reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules
or their complements hybridize to each other under stringent conditions, as described below.
Yet another indication that two nucleic acid sequences are substantially identical is that the
same primers can be used to amplify the sequences.

[0041] A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells, such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site).

[0042] The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

[0043] The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to 
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which 
one or more amino acid residues is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers. 

[0044] The term "amino acid" refers to naturally occurring and synthetic amino acids, as 
well as amino acid analogs and amino acid mimetics that function similarly to the naturally 
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

[0045] Amino acids may be referred to herein by either their commonly known three letter 
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes. 

[0046] "Conservatively modified variants" applies to both amino acid and nucleic acid 
sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally identical molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression 
product, but not necessarily with respect to actual probe sequences. 
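
The codon degeneracy described above can be made concrete with a short sketch. It builds the standard genetic code in RNA notation, lists the codons synonymous with a given codon, and counts how many distinct coding sequences encode the same polypeptide. The names CODON_TABLE, synonymous_codons, and count_silent_encodings are hypothetical and used only for this illustration.

```python
from itertools import product

# Standard genetic code, RNA alphabet, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # first base U
         "LLLLPPPPHHQQRRRR"   # first base C
         "IIIMTTTTNNKKSSRR"   # first base A
         "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon):
    """Return all codons (sorted) encoding the same amino acid as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_silent_encodings(rna):
    """Count coding sequences, including `rna` itself, with the same translation."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total
```

Under the standard code, the alanine codon GCA has four synonymous forms, while AUG and UGG (written TGG in DNA notation) each have only one, so the sequence AUG-GCA-UGG has four encodings in total.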

[0047] As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions, or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alter, add, or delete a single amino acid or a small percentage of amino acids in the
encoded sequence are "conservatively modified variants" where the alteration results in the
substitution of an amino acid with a chemically similar amino acid. Conservative substitution
tables providing functionally similar amino acids are well known in the art. Such
conservatively modified variants are in addition to and do not exclude polymorphic variants,
interspecies homologs, and alleles of the invention. The following groups each contain amino
acids that are typically conservative substitutions for one another: 1) Alanine (A), Glycine (G);
2) Aspartic acid (D), Glutamic acid (E); 3)






WO 2004/064601 




PCT/US2004/001153 



Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), 
Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine 
(S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton (1984) 
Proteins Freeman). 
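
The eight substitution groups listed above can be encoded directly as sets of one-letter codes. The sketch below checks whether replacing one residue with another stays within a group. The names CONSERVATIVE_GROUPS and is_conservative are hypothetical, and treating an unchanged residue as "not a substitution" is an assumption made for this illustration.

```python
# The eight groups as listed in the text, in one-letter amino acid codes.
# Methionine (M) appears in both group 5 and group 8, as in the source list.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one listed group."""
    if original == replacement:
        return False  # assumption: no change is not counted as a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)
```

Because methionine belongs to two groups, it is a conservative substitution for both leucine and cysteine, although leucine and cysteine are not conservative substitutions for each other under this list.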

[0048] "Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together. 
Oligonucleotides are typically less than about 100 nucleotides in length. A nucleic acid of 
the present invention will generally contain phosphodiester bonds, although in some cases, 
nucleic acid analogs are included that may have at least one different linkage, e.g., 
phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite 
linkages (see Eckstein (1992) Oligonucleotides and Analogues: A Practical Approach Oxford 
University Press); and peptide nucleic acid backbones and linkages. For example, peptide 
nucleic acids (PNA) which includes peptide nucleic acid analogs can be used in the 
invention. 

[0049] The nucleic acids may be single stranded or double stranded, as specified, or contain 
portions of both double stranded or single stranded sequence. As will be appreciated by those 
in the art, the depiction of a single strand also defines the sequence of the complementary 
strand; thus the sequences described herein also provide the complement of the sequence. 
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the 
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations 
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside. 

[0050] A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, chemical, or other physical means. For 
example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g.,
as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other entities
which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to
detect antibodies specifically reactive with the peptide. The labels may be incorporated into
the ovarian cancer nucleic acids, proteins and antibodies at any position. Any method known
in the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 194:495-496; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

[0051] An "effector" or "effector moiety" or "effector component" is a molecule that is 
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or 
non-covalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an 
antibody. The "effector" can be a variety of molecules including, e.g., detection moieties 
including radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such 
as epitope tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an
antibiotic; or a radioisotope emitting "hard," e.g., beta, radiation.

[0052] A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or non-covalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin and streptavidin.

[0053] The term "probe" or a "nucleic acid probe", as used herein, is defined to be a 
collection of one or more nucleic acid fragments whose hybridization to a sample can be 
detected. The probe may be unlabeled or labeled as described below so that its binding to the 
target or sample can be detected. Particularly in the case of arrays, either probe or target 
nucleic acids may be affixed to the array. Whether the array comprises "probe" or "target" 
nucleic acids will be evident from the context. Similarly, depending on context, either the 
probe, the target, or both can be labeled. In some embodiments, the probe may be a member 
of an array of spotted nucleic acids. Techniques capable of producing high density arrays can 
also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. 
Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) 
Biotechniques 23: 120-124; U.S. Patent No. 5,143,854). One of skill will recognize that the 
precise sequence of the particular probes described herein can be modified to a certain degree 
to produce probes that are "substantially identical" to the disclosed probes, but retain the 
ability to specifically bind to (i.e., hybridize specifically to) the same targets or samples as the
probe from which they were derived. In addition, those of skill will recognize that a probe 
can specifically bind to all or a fragment of a target nucleic acid. Such modifications are 
specifically covered by reference to the individual probes described herein. 

[0054] The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not 
normally joined, are both considered recombinant for the purposes of this invention. It is 
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or 
organism, it will replicate non-recombinantly, e.g., using the in vivo cellular machinery of the 
host cell rather than in vitro manipulations; however, such nucleic acids, once produced 
recombinantly, although subsequently replicated non-recombinantly, are still considered 
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein 
made using recombinant techniques, e.g., through the expression of a recombinant nucleic 
acid as depicted above. 

[0055] The term "heterologous" when used with reference to portions of a nucleic acid 
indicates that the nucleic acid comprises two or more subsequences that are not normally 
found in the same relationship to each other in nature. For instance, the nucleic acid is 
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes 
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a 
coding region from another source. Similarly, a heterologous protein will often refer to two 
or more subsequences that are not found in the same relationship to each other in nature (e.g., 
a fusion protein).

[0056] A "promoter" is defined as an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of a polymerase II type 
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor 
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is 
active under environmental or developmental regulation. The term "operably linked" refers 
to a functional linkage between a nucleic acid expression control sequence (such as a 
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, 
e.g., wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence. 

[0057] An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a 
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or 
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed operably linked to a promoter. 

[0058] The phrase "selectively (or specifically) hybridizes to" refers to the binding, 
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under 
stringent hybridization conditions when that sequence is present in a complex mixture (e.g., 
total cellular or library DNA or RNA). 

[0059] The phrase "stringent hybridization conditions" refers to conditions under which a 
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic 
acids, but to no other sequences. Stringent conditions are sequence-dependent and will be 
different in different circumstances. Longer sequences hybridize specifically at higher 
temperatures. An extensive guide to the hybridization of nucleic acids is found in "Overview 
of principles of hybridization and the strategy of nucleic acid assays" in Tijssen (1993) 
Hybridization with Nucleic Probes (Laboratory Techniques in Biochemistry and Molecular 
Biology), (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic
acid concentration) at which 50% of the probes complementary to the target hybridize to the
target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the
probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. For selective or specific hybridization, a positive
signal is typically at least two times background, preferably 10 times background
hybridization. Exemplary stringent hybridization conditions can be as follows: 50%
formamide, 5x SSC, and 1% SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at
65° C, with wash in 0.2x SSC, and 0.1% SDS at 65° C. For PCR, a temperature of about 36°
C is typical for low stringency amplification, although annealing temperatures may vary
between about 32-48° C depending on primer length. For high stringency PCR amplification,
a temperature of about 62° C is typical, although high stringency annealing temperatures can
range from about 50° C to about 65° C, depending on the primer length and specificity.
Typical cycle conditions for both high and low stringency amplifications include a
denaturation phase of 90-95° C for 30-120 sec, an annealing phase lasting 30-120 sec, and an
extension phase of about 72° C for 1-2 min. Protocols and guidelines for low and high
stringency amplification reactions are available, e.g., in Innis, et al. (1990) PCR Protocols: A
Guide to Methods and Applications Academic Press, N.Y.
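
The numeric ranges given above for stringent conditions and for a positive hybridization signal can be collected into a small check. This is a sketch under stated assumptions: the function names are hypothetical, the "about" values are treated as exact bounds, and the effect of destabilizing agents such as formamide (which permit lower temperatures) is deliberately not modeled.

```python
def meets_stringent_ranges(probe_len_nt, temp_c, sodium_molar, ph):
    """Check buffer and temperature against the ranges stated in the text.

    Assumptions: bounds are treated as exact, probes of 50 nt or fewer count
    as "short", and formamide or other destabilizing agents are ignored.
    """
    min_temp_c = 30.0 if probe_len_nt <= 50 else 60.0
    salt_ok = 0.01 <= sodium_molar <= 1.0
    ph_ok = 7.0 <= ph <= 8.3
    return salt_ok and ph_ok and temp_c >= min_temp_c


def is_positive_signal(signal, background, fold=2.0):
    """True if signal is at least `fold` times background (2x typical, 10x preferred)."""
    return signal >= fold * background
```

For instance, a 200-nucleotide probe hybridized at 42° C without formamide would not satisfy the long-probe temperature range in this check, whereas the same probe at 65° C would.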

[0060] "Inhibitors" and "modulators" of polynucleotides of the invention are used to refer
to molecules or agents that inhibit or modulate the oncogenic effects of the proteins described
here. Such agents can be identified using in vitro and in vivo assays described below.
Inhibitors are compounds that, e.g., bind to, partially or totally block activity, decrease,
prevent, delay activation, inactivate, desensitize, or down regulate the activity or expression
of the 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7,
ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 proteins described here. Such agents
include, for example, antisense or inhibitory RNAs which inhibit expression of the 26#77,
CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 gene. Inhibitors also include antibodies that bind
specifically to 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 proteins. In some
embodiments, humanized antibodies are used as inhibitors, and can be used therapeutically.
Assays for inhibitors include, e.g., expressing the 26#77, CPNE1, ITGB4BP, RAE1, BMP7,
GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
protein in vitro, in cells, or cell membranes, applying test compounds, and then determining
the functional effects on activity (e.g., changes in growth of the cell). Changes in cell growth
could be any property associated with a neoplastic phenotype, for example, cell viability,
formation of foci, anchorage independence, semi-solid or soft agar growth, change in contact
inhibition or density limitation of growth, loss of growth factor or serum requirements,
change in cell morphology, gain or loss of immortalization, gain or loss of tumor specific
markers, ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell.

[0061] "Tumor cell" refers to pre-cancerous, cancerous, and normal cells in a tumor.

[0062] "Cancer cells," "transformed" cells or "transformation" in tissue culture, refers to
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus 
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is typically associated with phenotypic changes, such as immortalization of 
cells, aberrant growth control, non-morphological changes, and/or malignancy. 

[0063] As used herein, "antibody" includes reference to an immunoglobulin molecule
immunologically reactive with a particular antigen, and includes both polyclonal and
monoclonal antibodies. The term also includes genetically engineered forms such as
chimeric antibodies (e.g., humanized murine antibodies) and heteroconjugate antibodies (e.g.,
bispecific antibodies). The term "antibody" also includes antigen binding forms of
antibodies, including fragments with antigen-binding capability (e.g., Fab', F(ab')2, Fab, Fv
and rIgG). See also, Pierce Catalog and Handbook, 1994-1995 (Pierce Chemical Co.,
Rockford, IL). See also, e.g., Kuby, J., Immunology, 3rd Ed., W.H. Freeman & Co., New
York (1998). The term also refers to recombinant single chain Fv fragments (scFv). The
term antibody also includes bivalent or bispecific molecules, diabodies, triabodies, and
tetrabodies. Bivalent and bispecific molecules are described in, e.g., Kostelny et al. (1992) J
Immunol 148:1547, Pack and Pluckthun (1992) Biochemistry 31:1579, Hollinger et al., 1993,
supra, Gruber et al. (1994) J Immunol :5368, Zhu et al. (1997) Protein Sci 6:781, Hu et al.
(1996) Cancer Res. 56:3055, Adams et al. (1993) Cancer Res. 53:4026, and McCartney, et
al. (1995) Protein Eng. 8:301.

[0064] An antibody immunologically reactive with a particular antigen can be generated by
recombinant methods such as selection of libraries of recombinant antibodies in phage or
similar vectors, see, e.g., Huse et al., Science 246:1275-1281 (1989); Ward et al., Nature
341:544-546 (1989); and Vaughan et al., Nature Biotech. 14:309-314 (1996), or by
immunizing an animal with the antigen or with DNA encoding the antigen.

[0065] Typically, an immunoglobulin has a heavy and light chain. Each heavy and light
chain contains a constant region and a variable region (the regions are also known as
"domains"). Light and heavy chain variable regions contain four "framework" regions
interrupted by three hypervariable regions, also called "complementarity-determining
regions" or "CDRs". The extent of the framework regions and CDRs has been defined.
The sequences of the framework regions of different light or heavy chains are relatively
conserved within a species. The framework region of an antibody, that is the combined
framework regions of the constituent light and heavy chains, serves to position and align the
CDRs in three dimensional space.

[0066] The CDRs are primarily responsible for binding to an epitope of an antigen. The
CDRs of each chain are typically referred to as CDR1, CDR2, and CDR3, numbered
sequentially starting from the N-terminus, and are also typically identified by the chain in
which the particular CDR is located. Thus, a VH CDR3 is located in the variable domain of
the heavy chain of the antibody in which it is found, whereas a VL CDR1 is the CDR1 from
the variable domain of the light chain of the antibody in which it is found.

[0067] References to "V_H" or a "VH" refer to the variable region of an immunoglobulin
heavy chain of an antibody, including the heavy chain of an Fv, scFv, or Fab. References to
"V_L" or a "VL" refer to the variable region of an immunoglobulin light chain, including the
light chain of an Fv, scFv, dsFv or Fab.

[0068] The phrase "single chain Fv" or "scFv" refers to an antibody in which the variable 
domains of the heavy chain and of the light chain of a traditional two chain antibody have 
been joined to form one chain. Typically, a linker peptide is inserted between the two chains 
to allow for proper folding and creation of an active binding site.

[0069] A "chimeric antibody" is an immunoglobulin molecule in which (a) the constant
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector function
and/or species, or an entirely different molecule which confers new properties to the chimeric
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a
different or altered antigen specificity.

[0070] A "humanized antibody" is an immunoglobulin molecule which contains minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of the human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. The
humanized antibody optimally also will comprise at least a portion of an immunoglobulin
constant region (Fc), typically that of a human immunoglobulin (Jones et al., Nature
321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); and Presta, Curr. Op.
Struct. Biol. 2:593-596 (1992)). Humanization can be essentially performed following the
method of Winter and co-workers (Jones et al., Nature 321:522-525 (1986); Riechmann et
al., Nature 332:323-327 (1988); Verhoeyen et al., Science 239:1534-1536 (1988)), by
substituting rodent CDRs or CDR sequences for the corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by the corresponding sequence from a non-human species.

[0071] "Epitope" or "antigenic determinant" refers to a site on an antigen to which an
antibody binds. Epitopes can be formed both from contiguous amino acids or noncontiguous
amino acids juxtaposed by tertiary folding of a protein. Epitopes formed from contiguous
amino acids are typically retained on exposure to denaturing solvents whereas epitopes
formed by tertiary folding are typically lost on treatment with denaturing solvents. An
epitope typically includes at least 3, and more usually, at least 5 or 8-10 amino acids in a
unique spatial conformation. Methods of determining spatial conformation of epitopes
include, for example, x-ray crystallography and 2-dimensional nuclear magnetic resonance.
See, e.g., Epitope Mapping Protocols in Methods in Molecular Biology, Vol. 66, Glenn E.
Morris, Ed (1996).

BRIEF DESCRIPTION OF THE DRAWINGS 

[0072] Figure 1 illustrates a comparison of Northern and Western blots showing that both
RNA and protein expression of 26#77 are higher in colorectal cancers with an amplified
26#77 gene (T) compared to the patients' normal colorectal tissue (N).

[0073] Figure 2 provides gene names, symbols, Unigene ID numbers, and accession 
numbers of reference DNA sequences for CPNE1, ITGB4BP, RAE1, BMP7, GNAS,
EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, and C20orf188.

DETAILED DESCRIPTION 

[0074] This invention provides novel therapeutic and diagnostic methods for treatment and
detection of cancer, as well as methods for screening for compositions which can be used to
treat cancer. As shown below, the invention is based, at least in part, on the discovery that
26#77 is overexpressed in colorectal and breast cancer cells; and that CPNE1, ITGB4BP,
RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20,
or C20orf188 are overexpressed in colorectal cancer cells. The overexpression of these genes
therefore facilitates progression of carcinogenesis.

METHODS OF SCREENING FOR INCREASED COPY NUMBER OR 
OVEREXPRESSION OF GENES 

[0075] In one aspect, 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2,
DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 genes (or their
expression levels) are detected in different patient samples for which either diagnosis or
prognosis information is desired. For example, the presence of cancer is evaluated by a
determination of the increased copy number of 26#77, CPNE1, ITGB4BP, RAE1, BMP7,
GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
genes in the patient. Methods of evaluating the presence and/or copy number of a particular
gene or to determine the presence or absence of polymorphisms in the gene are well known to
those of skill in the art. For example, hybridization based assays can be used for these
purposes.

Hybridization-based assays 
[0076] Hybridization assays can be used to detect copy number of 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 genes. Hybridization-based assays include, but are not
limited to, traditional "direct probe" methods such as Southern blots or in situ hybridization
(e.g., FISH), and "comparative probe" methods such as comparative genomic hybridization
(CGH). The methods can be used in a wide variety of formats including, but not limited to,
substrate- (e.g. membrane or glass) bound methods or array-based approaches as described
below.

[0077] In a typical in situ hybridization assay, cells or tissue sections are fixed to a solid
support, typically a glass slide. If a nucleic acid is to be probed, the cells are typically
denatured with heat or alkali. The cells are then contacted with a hybridization solution at a
moderate temperature to permit annealing of labeled probes specific to the nucleic acid
sequence encoding the protein. The targets (e.g., cells) are then typically washed at a
predetermined stringency or at an increasing stringency until an appropriate signal to noise
ratio is obtained.

[0078] The probes are typically labeled, e.g., with radioisotopes or fluorescent reporters.
Preferred probes are sufficiently long so as to specifically hybridize with the target nucleic
acid(s) under stringent conditions. The preferred size range is from about 200 bp to about
1000 bases.

[0079] In some applications it is necessary to block the hybridization capacity of repetitive
sequences. Thus, in some embodiments, tRNA, human genomic DNA, or Cot-1 DNA is used
to block non-specific hybridization.

[0080] In comparative genomic hybridization methods a first collection of (sample) nucleic
acids (e.g., from a possible tumor) is labeled with a first label, while a second collection of
(control) nucleic acids (e.g., from a healthy cell/tissue) is labeled with a second label. The
ratio of hybridization of the nucleic acids is determined by the ratio of the two (first and
second) labels binding to each fiber in the array. Where there are chromosomal deletions or
multiplications, differences in the ratio of the signals from the two labels will be detected and
the ratio will provide a measure of the copy number.
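
The two-label ratio described above can be illustrated with a minimal sketch. It assumes that the signals have already been background-corrected and normalized, and that the control sample carries two copies of the locus; the function name and the example signal values are hypothetical and are not taken from the specification.

```python
import math

def estimate_copy_number(sample_signal, control_signal, control_copies=2):
    """Estimate copy number at one array element from two-label signals.

    Assumes (hypothetically) background-corrected, normalized signals and a
    control genome carrying `control_copies` copies of the locus.
    """
    if sample_signal <= 0 or control_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = sample_signal / control_signal      # first label / second label
    log2_ratio = math.log2(ratio)               # > 0 suggests gain, < 0 suggests loss
    copies = ratio * control_copies             # scaled to the control copy number
    return ratio, log2_ratio, copies

# Illustrative values only: sample signal three times the control signal.
ratio, log2_ratio, copies = estimate_copy_number(3000.0, 1000.0)
```

In practice the relationship between signal ratio and copy number is often compressed, so a value computed this way is better read as a relative measure than as an exact count.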

[0081] Hybridization protocols suitable for use with the methods of the invention are
described, e.g., in Albertson (1984) EMBO J. 3: 1227-1234; Pinkel (1988) Proc. Natl. Acad.
Sci. USA 85: 9138-9142; EPO Pub. No. 430,402; Methods in Molecular Biology, Vol. 33: In
Situ Hybridization Protocols, Choo, ed., Humana Press, Totowa, NJ (1994), etc. In one
particularly preferred embodiment, the hybridization protocol of Pinkel et al. (1998) Nature
Genetics 20: 207-211, or of Kallioniemi (1992) Proc. Natl. Acad. Sci. USA 89: 5321-5325
is used.

[0082] A variety of nucleic acid hybridization formats are known to those skilled in the art.
For example, common formats include sandwich assays and competition or displacement
assays. Hybridization techniques are generally described in Hames and Higgins (1985)
Nucleic Acid Hybridization, A Practical Approach, IRL Press; Gall and Pardue (1969) Proc.
Natl. Acad. Sci. USA 63: 378-383; and John et al. (1969) Nature 223: 582-587.

[0083] The sensitivity of the hybridization assays may be enhanced through use of a
nucleic acid amplification system that multiplies the target nucleic acid being detected.
Examples of such systems include the polymerase chain reaction (PCR) system and the ligase
chain reaction (LCR) system. Other methods recently described in the art are the nucleic acid
sequence based amplification (NASBA, Cangene, Mississauga, Ontario) and Q Beta
Replicase systems.

[0084] Typically, labeled signal nucleic acids are used to detect hybridization. The labels
may be incorporated by any of a number of means well known to those of skill in the art.
Means of attaching labels to nucleic acids include, for example, nick translation, or
end-labeling by kinasing of the nucleic acid and subsequent attachment (ligation) of a linker
joining the sample nucleic acid to a label (e.g., a fluorophore). A wide variety of linkers for
the attachment of labels to nucleic acids are also known. In addition, intercalating dyes and
fluorescent nucleotides can also be used.

[0085] Detectable labels suitable for use in the present invention include any composition
detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical
or chemical means. Useful labels in the present invention include biotin for staining with
labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent labels (e.g.,
fluorescein, texas red, rhodamine, green fluorescent protein, and the like, see, e.g., Molecular
Probes, Eugene, Oregon, USA), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g.,
horse radish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and
colorimetric labels such as colloidal gold (e.g., gold particles in the 40-80 nm diameter size
range scatter green light with high efficiency) or colored glass or plastic (e.g., polystyrene,
polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Patent
Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

[0086] The label may be added to the nucleic acids prior to, or after, the hybridization.
So-called "direct labels" are detectable labels that are directly attached to or incorporated into the
sample or probe nucleic acids prior to hybridization. In contrast, so-called "indirect labels"
are joined to the hybrid duplex after hybridization. Often, the indirect label is attached to a
binding moiety that has been attached to the target nucleic acid prior to the hybridization.
Thus, for example, the target nucleic acid may be biotinylated before the hybridization. After
hybridization, an avidin-conjugated fluorophore will bind the biotin-bearing hybrid duplexes,
providing a label that is easily detected. For a detailed review of methods of labeling nucleic
acids and detecting labeled hybridized nucleic acids see Laboratory Techniques in
Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P.
Tijssen, ed. Elsevier, N.Y. (1993).

[0087] The methods of this invention are particularly well suited to array-based
hybridization formats. For a description of one preferred array-based hybridization system
see Pinkel et al. (1998) Nature Genetics 20: 207-211.

[0088] Arrays are a multiplicity of different "probe" or "target" nucleic acids (or other 
compounds) attached to one or more surfaces (e.g., solid, membrane, or gel). In a preferred 
embodiment, the multiplicity of nucleic acids (or other moieties) is attached to a single 
contiguous surface or to a multiplicity of surfaces juxtaposed to each other. 

[0089] In an array format a large number of different hybridization reactions can be run
essentially "in parallel." This provides rapid, essentially simultaneous, evaluation of a
number of hybridizations in a single "experiment". Methods of performing hybridization
reactions in array-based formats are well known to those of skill in the art (see, e.g., Pastinen
(1997) Genome Res. 7: 606-614; Jackson (1996) Nature Biotechnology 14: 1685; Chee (1995)
Science 274: 610; WO 96/17958; Pinkel et al. (1998) Nature Genetics 20: 207-211).

[0090] Arrays, particularly nucleic acid arrays, can be produced according to a wide variety
of methods well known to those of skill in the art. For example, in a simple embodiment,
"low density" arrays can simply be produced by spotting (e.g., by hand using a pipette)
different nucleic acids at different locations on a solid support (e.g., a glass surface, a
membrane, etc.).

[0091] The DNA used to prepare the arrays of the invention is not critical. For example, the
arrays can include genomic DNA, e.g., overlapping clones that provide a high resolution scan
of a portion of the genome containing the desired gene, or of the gene itself. Genomic
nucleic acids can be obtained from, e.g., HACs, MACs, YACs, BACs, PACs, P1s, cosmids,
plasmids, inter-Alu PCR products of genomic clones, restriction digests of genomic clones,
cDNA clones, amplification (e.g., PCR) products, and the like.

[0092] Arrays can also be produced using oligonucleotide synthesis technology. Thus, for
example, U.S. Patent No. 5,143,854 and PCT Patent Publication Nos. WO 90/15070 and
92/10092 teach the use of light-directed combinatorial synthesis of high density
oligonucleotide arrays.

Amplification-based assays. 

[0093] In other embodiments, amplification-based assays can be used to measure 26#77,
CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 gene copy number in a sample. In such
amplification-based assays, the nucleic acid sequences act as a template in an amplification
reaction (e.g., Polymerase Chain Reaction (PCR)). In a quantitative amplification, the amount of
amplification product will be proportional to the amount of template in the original sample.
Comparison to appropriate (e.g., healthy tissue) controls provides a measure of the copy
number.

[0094] Methods of "quantitative" amplification are well known to those of skill in the art.
For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a
control sequence using the same primers. This provides an internal standard that may be used
to calibrate the PCR reaction. Detailed protocols for quantitative PCR are provided in Innis
et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc.,
N.Y. The known nucleic acid sequence for the genes is sufficient to enable one of skill to
routinely select primers to amplify any portion of the gene.

[0095] Real-time PCR is another amplification technique that can be used to determine
gene copy levels or levels of mRNA expression. (See, e.g., Gibson et al., Genome Research
6:995-1001, 1996; Heid et al., Genome Research 6:986-994, 1996). Real-time PCR is a
technique that evaluates the level of PCR product accumulation during amplification. This
technique permits quantitative evaluation of mRNA levels in multiple samples. For gene
copy levels, total genomic DNA is isolated from a sample. For mRNA levels, mRNA is
extracted from tumor and normal tissue and cDNA is prepared using standard techniques.
Real-time PCR can be performed, for example, using a Perkin Elmer/Applied Biosystems
(Foster City, Calif.) 7700 Prism instrument. Matching primers and fluorescent probes can be
designed for genes of interest using, for example, the Primer Express program provided by
Perkin Elmer/Applied Biosystems (Foster City, Calif.). Optimal concentrations of primers
and probes can be initially determined by those of ordinary skill in the art, and control (for
example, β-actin) primers and probes may be obtained commercially from, for example,
Perkin Elmer/Applied Biosystems (Foster City, Calif.). To quantitate the amount of the
specific nucleic acid of interest in a sample, a standard curve is generated using a control.
Standard curves may be generated using the Ct values determined in the real-time PCR,
which are related to the initial concentration of the nucleic acid of interest used in the assay.
Standard dilutions ranging from 10 to 10^6 copies of the gene of interest are generally sufficient.
In addition, a standard curve is generated for the control sequence. This permits
standardization of initial content of the nucleic acid of interest in a tissue sample to the
amount of control for comparison purposes.
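
The standard-curve calculation described above can be sketched as follows. This is an illustrative outline only: it assumes the usual linear relationship between Ct and the base-10 logarithm of the starting copy number, and the function names and the Ct values are hypothetical, made up for the example rather than measured.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies from a Ct."""
    return 10 ** ((ct - intercept) / slope)

def normalized_amount(target_copies, control_copies):
    """Express the target relative to the control sequence (e.g., beta-actin)."""
    return target_copies / control_copies

# Made-up standards spanning 10 to 10^6 copies, 3.3 cycles per tenfold dilution.
standards = [10, 100, 1000, 10000, 100000, 1000000]
cts = [36.0, 32.7, 29.4, 26.1, 22.8, 19.5]
slope, intercept = fit_standard_curve(standards, cts)
target_copies = copies_from_ct(27.75, slope, intercept)
```

A separate curve would be fitted in the same way for the control sequence, and `normalized_amount` would then be applied to the two estimates for each tissue sample.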

[0096] Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4: 560, Landegren et al. (1988)
Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription amplification
(Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), self-sustained sequence replication
(Guatelli et al. (1990) Proc. Nat. Acad. Sci. USA 87: 1874), dot PCR, and linker adapter PCR,
etc.

Detection of gene expression

[0097] 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7,
ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 gene expression level can also be
assayed as a marker for cancer. In preferred embodiments, activity of the 26#77 gene is
determined by a measure of gene transcript (e.g., mRNA), by a measure of the quantity of
translated protein, or by a measure of gene product activity. In additional embodiments,
activity of a CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 gene is determined by a measure of gene
transcript (e.g., mRNA), by a measure of the quantity of translated protein, or by a measure of
gene product activity.

[0098] Methods of detecting and/or quantifying the gene transcript (mRNA or cDNA)
using nucleic acid hybridization techniques are known to those of skill in the art (see
Sambrook et al. supra). For example, one method for evaluating the presence, absence, or
quantity of mRNA involves a Northern blot transfer.

[0099] The probes can be full length or less than the full length of the nucleic acid
sequence encoding the protein. Shorter probes are empirically tested for specificity.
Preferably nucleic acid probes are 20 bases or longer in length. (See Sambrook et al. for
methods of selecting nucleic acid probe sequences for use in nucleic acid hybridization.)
Visualization of the hybridized portions allows the qualitative determination of the presence
or absence of mRNA.

[0100] In another preferred embodiment, a transcript (e.g., mRNA) can be measured using
amplification (e.g., PCR) based methods as described above for directly assessing copy
number of DNA. In a preferred embodiment, transcript level is assessed by using reverse
transcription PCR (RT-PCR). In another preferred embodiment, transcript level is assessed
by using real-time PCR.

[0101] The expression level of a 26#77 gene can also be detected and/or quantified by
detecting or quantifying the expressed 26#77 polypeptide. Similarly, the expression level of
a CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 gene can also be detected and/or quantified
by detecting or quantifying the expressed CPNE1, ITGB4BP, RAE1, BMP7, GNAS,
EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
polypeptide. The polypeptide can be detected and quantified by any of a number of means
well known to those of skill in the art. These may include analytic biochemical methods such
as electrophoresis, capillary electrophoresis, high performance liquid chromatography
(HPLC), thin layer chromatography (TLC), hyperdiffusion chromatography, and the like, or
various immunological methods such as fluid or gel precipitin reactions, immunodiffusion
(single or double), immunoelectrophoresis, radioimmunoassay (RIA), enzyme-linked
immunosorbent assays (ELISAs), immunofluorescent assays, western blotting, and the like.
Immunohistochemical methods can also be used to detect 26#77, CPNE1, ITGB4BP, RAE1,
BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or
C20orf188 protein. With immunohistochemical staining techniques, a cell sample is
prepared, typically by dehydration and fixation, followed by reaction with labeled antibodies
specific for the gene product, where the labels are usually visually detectable, such as
enzymatic labels, fluorescent labels, luminescent labels, and the like. A particularly sensitive
staining technique suitable for use in the present invention is described by Hsu et al. (1980)
Am. J. Clin. Path. 75:734-738. The isolated proteins can also be sequenced according to
standard techniques to identify polymorphisms.

[0102] The 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 polypeptide is detected
and/or quantified using any of a number of well recognized immunological binding assays
(see, e.g., U.S. Patents 4,366,241; 4,376,110; 4,517,288; and 4,837,168). For a review of the
general immunoassays, see also Asai (1993) Methods in Cell Biology Volume 37: Antibodies
in Cell Biology, Academic Press, Inc. New York; Stites & Terr (1991) Basic and Clinical
Immunology 7th Edition.

[0103] Immunological binding assays (or immunoassays) typically utilize a "capture agent" 
to specifically bind to and often immobilize the analyte (polypeptide or subsequence). The 
capture agent is a moiety that specifically binds to the analyte. In a preferred embodiment, 
the capture agent is an antibody that specifically binds a polypeptide. The antibody (anti- 
peptide) may be produced by any of a number of means well known to those of skill in the 
art. 

[0104] Immunoassays also often utilize a labeling agent to specifically bind to and label the 
binding complex formed by the capture agent and the analyte. The labeling agent may itself 
be one of the moieties comprising the antibody/analyte complex. Thus, the labeling agent 
may be a labeled polypeptide or a labeled anti-antibody. Alternatively, the labeling agent 
may be a third moiety, such as another antibody, that specifically binds to the 
antibody/polypeptide complex. 

[0105] In one preferred embodiment, the labeling agent is a second human antibody
bearing a label. Alternatively, the second antibody may lack a label, but it may, in turn, be
bound by a labeled third antibody specific to antibodies of the species from which the second
antibody is derived. The second antibody can be modified with a detectable moiety, e.g., biotin, to
which a third labeled molecule can specifically bind, such as enzyme-labeled streptavidin. In
some embodiments, Western blot analysis is used to detect and/or quantify 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein.

WO 2004/064601 PCT/US2004/001153

[0106] Other proteins capable of specifically binding immunoglobulin constant regions,
such as protein A or protein G, may also be used as the label agent. These proteins are normal
constituents of the cell walls of streptococcal bacteria. They exhibit a strong
non-immunogenic reactivity with immunoglobulin constant regions from a variety of species (see,
generally, Kronval et al. (1973) J. Immunol. 111: 1401-1406, and Akerstrom (1985) J.
Immunol. 135: 2589-2542).

[0107] 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7,
ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein can be detected and/or
quantified in cells using immunocytochemical or immunohistochemical methods. IHC
(immunohistochemistry) can be performed on paraffin-embedded tumor blocks using a
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188-specific antibody. IHC is the method of
colorimetric or fluorescent detection of archival samples, usually paraffin-embedded, using an
antibody that is placed directly on slides cut from the paraffin block. To detect and/or
quantify 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7,
ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 in, for example, tissue culture cells or
cells from a subject that are not embedded in paraffin (for example, hematopoietic cells), ICC
(immunocytochemistry) can be used. ICC is like IHC but uses fresh, non-paraffin-embedded
cells plated onto slides and then fixed and stained.

[0108] Either polyclonal or monoclonal antibodies may be used in the immunoassays of the
invention described herein. Polyclonal antibodies are preferably raised by multiple injections
(e.g., subcutaneous or intramuscular injections) of substantially pure polypeptides or antigenic
polypeptides into a suitable non-human mammal. The antigenicity of peptides can be
determined by conventional techniques to determine the magnitude of the antibody response
of an animal that has been immunized with the peptide. Generally, the peptides that are used
to raise the anti-peptide antibodies should be those which induce production of high
titers of antibody with relatively high affinity for the polypeptide.

[0109] Preferably, the antibodies produced will be monoclonal antibodies ("mAb's"). For
preparation of monoclonal antibodies, immunization of a mouse or rat is preferred.
Polyclonal antibodies can also be used.

[0110] It is also possible to evaluate a mAb to determine whether it has the same
specificity as a mAb of the invention without undue experimentation by determining whether
the mAb being tested prevents a mAb of the invention from binding to the subject gene
product isolated as described above. If the mAb being tested competes with the mAb of the
invention, as shown by a decrease in binding by the mAb of the invention, then it is likely
that the two monoclonal antibodies bind to the same or a closely related epitope. Still another
way to determine whether a mAb has the specificity of a mAb of the invention is to
preincubate the mAb of the invention with an antigen with which it is normally reactive, and
determine if the mAb being tested is inhibited in its ability to bind the antigen. If the mAb
being tested is inhibited then, in all likelihood, it has the same, or a closely related, epitopic
specificity as the mAb of the invention.
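
The competition test described above reduces to comparing binding signals with and without the competing antibody. The sketch below is illustrative only: the function names, the 50% cutoff, and the signal values are hypothetical assumptions and are not specified anywhere in this document.

```python
def percent_inhibition(binding_alone, binding_with_competitor):
    """Percent decrease in binding of the reference mAb caused by the test mAb."""
    if binding_alone <= 0:
        raise ValueError("binding_alone must be positive")
    return 100.0 * (1.0 - binding_with_competitor / binding_alone)

def likely_same_epitope(binding_alone, binding_with_competitor, cutoff=50.0):
    """Flag competition using a hypothetical cutoff (not taken from the source)."""
    return percent_inhibition(binding_alone, binding_with_competitor) >= cutoff

# Illustrative signals only (e.g., absorbance readings from a binding assay).
inhibition = percent_inhibition(1.0, 0.25)
competes = likely_same_epitope(1.0, 0.25)
```

Any cutoff would need to be set empirically for the assay in question; the value used here only makes the example concrete.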

[0111] The assays of this invention have immediate utility in detecting/predicting the 
likelihood of a cancer, in estimating survival from a cancer, in screening for agents that 
modulate the subject gene product activity, and in screening for agents that inhibit cell 
proliferation. 

METHODS OF SCREENING FOR GENE PRODUCT FUNCTION 
[0112] Assays for 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 function can be designed to
detect and/or quantify any effect that is indirectly or directly under the influence of the
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 protein or nucleic acid, e.g., a functional,
physical, or chemical effect. Such assays can be used to test whether a biological sample
comprises a functional 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2,
DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein, to test
whether variant 26#77 polypeptides retain function, or to identify compounds that modulate
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 activity in cells.

[0113] Typical assays useful in the present invention are those designed to test neoplastic
characteristics of cancer cells. These assays include cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation; cell
death (apoptosis); cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of cancer cells.

[0114] The ability of 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2,
DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 polynucleotides
to promote cell growth can also be assessed by introducing the polynucleotides into cells
and assessing the growth of those cells in vitro or in vivo.

[0115] Assays may include those designed to test the ability of test agents to bind the
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 protein and thereby modulate its activity.
Virtually any agent can be tested in such an assay. Such agents include, but are not limited to,
antibodies, natural or synthetic nucleic acids, natural or synthetic polypeptides, natural or
synthetic lipids, natural or synthetic small organic molecules, and the like.

[0116] Proteins interacting with the peptide or with the protein encoded by the cDNA (e.g.,
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188) can be isolated using a yeast two-hybrid
system, mammalian two-hybrid system, or phage display screen, etc. Targets so identified
can be further used as bait in these assays to identify additional proteins that interact with
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 or are downstream of 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188; which proteins are also targets for drug development
(see, e.g., Fields et al., Nature 340:245 (1989); Vasavada et al., Proc. Nat'l Acad. Sci. USA
88:10686 (1991); Fearon et al., Proc. Nat'l Acad. Sci. USA 89:7958 (1992); Dang et al., Mol.
Cell. Biol. 11:954 (1991); Chien et al., Proc. Nat'l Acad. Sci. USA 88:9578 (1991); and U.S.
Patent Nos. 5,283,173, 5,667,973, 5,468,614, 5,525,490, and 5,637,463).

[0117] Any of the assays for detecting 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS,
EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 binding
are amenable to high throughput screening. High throughput assays for the presence,
absence, or quantification of particular nucleic acids or protein products are well known to
those of skill in the art. Binding assays and reporter gene assays are similarly well
known. Thus, for example, U.S. Patent 5,559,410 discloses high throughput screening
methods for proteins, U.S. Patent 5,585,639 discloses high throughput screening methods for
nucleic acid binding (i.e., in arrays), while U.S. Patents 5,576,220 and 5,541,061 disclose
high throughput methods of screening for ligand/antibody binding.

[0118] In addition, high throughput screening systems are commercially available (see,
e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, for example, Zymark Corp.
provides technical bulletins describing screening systems for detecting the modulation of
gene transcription, ligand binding, and the like.

RECOMBINANT PRODUCTION OF 26#77 POLYPEPTIDES 
[0119] The present invention also provides methods, reagents, and vectors useful for
expression of 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 polypeptides and nucleic
acids in vitro. In vitro expression is particularly useful for production of 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 polypeptides.

[0120] Any number of well known host cells can be used for production of 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 polypeptides. Host cells may be cultured cells, cell lines,
cells in vivo, and the like. Host cells may be prokaryotic cells such as bacterial cells (e.g., E.
coli), or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells such as CHO,
HeLa, and the like.

[0121] The particular procedure used to introduce the nucleic acids into a host cell for
expression of the 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A,
PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein is not critical to the
invention. Any of the well known procedures for introducing foreign nucleotide sequences
into host cells in vitro may be used. These include the use of calcium phosphate transfection,
electroporation, liposome-mediated transfection, injection and microinjection, ballistic
methods, viral particles, virosomes, immunoliposomes, polycation:nucleic acid conjugates,
naked DNA, artificial virions, agent-enhanced uptake of DNA, and the like.

[0122] In these embodiments of this invention, 26#77, CPNE1, ITGB4BP, RAE1, BMP7,
GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
nucleic acids are inserted into vectors using standard molecular biological techniques.
Vectors may be used at multiple stages of the practice of the invention, including for
subcloning nucleic acids encoding components of the 26#77, CPNE1, ITGB4BP, RAE1,
BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or
C20orf188 protein as well as additional elements controlling protein expression, vector
selectability, etc. Vectors may also be used to maintain or amplify the nucleic acids, for
example by inserting the vector into prokaryotic or eukaryotic cells and growing the cells in
culture. In addition, vectors may be used to introduce and express nucleic acids into cells for
therapeutic or experimental purposes.

[0123] A variety of commercially or commonly available vectors and vector nucleic acids
can be converted into a vector of the invention by cloning a nucleic acid encoding a 26#77,
CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 protein of the invention into the commercially or
commonly available vector. A variety of common vectors suitable for this purpose are well
known in the art.

[0124] In a typical embodiment, a 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS,
EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188
polynucleotide is placed under the control of a promoter. A nucleic acid is "operably linked"
to a promoter when it is placed into a functional relationship with the promoter. For instance,
a promoter or enhancer is operably linked to a coding sequence if it increases or otherwise
regulates the transcription of the coding sequence. Similarly, a "recombinant expression
cassette" or simply an "expression cassette" is a nucleic acid construct, generated
recombinantly or synthetically, with nucleic acid elements that are capable of effecting
expression of a structural gene in hosts compatible with such sequences. Expression cassettes
include promoters and, optionally, introns, polyadenylation signals, and transcription
termination signals. Typically, the recombinant expression cassette includes a nucleic acid to
be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter.
Additional factors necessary or helpful in effecting expression may also be used as described
herein. For example, an expression cassette can also include nucleotide sequences that
encode a signal sequence that directs secretion of an expressed protein from the host cell.
Transcription termination signals, enhancers, and other nucleic acid sequences that influence
gene expression can also be included in an expression cassette.

[0125] An extremely wide variety of promoters are well known, and can be used in the
vectors of the invention, depending on the particular application. Ordinarily, the promoter
selected depends upon the cell in which the promoter is to be active. Other expression
control sequences such as ribosome binding sites, transcription termination sites and the like
are also optionally included. For E. coli, example control sequences include the T7, trp, or
lambda promoters, a ribosome binding site and preferably a transcription termination signal.
For eukaryotic cells, the control sequences typically include a promoter which optionally
includes an enhancer derived from immunoglobulin genes, SV40, cytomegalovirus, a
retrovirus (e.g., an LTR based promoter) etc., and a polyadenylation sequence, and may
include splice donor and acceptor sequences.

[0126] For long-term, high-yield production of recombinant proteins, stable expression will
often be desired. For example, cell lines which stably express a 26#77, CPNE1, ITGB4BP,
RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20,
or C20orf188 protein can be prepared using expression vectors of the invention which
contain viral origins of replication or endogenous expression elements and a selectable
marker gene. Following the introduction of the vector, cells may be allowed to grow for 1-2
days in enriched media before they are switched to selective media. The purpose of the
selectable marker is to confer resistance to selection, and its presence allows growth of cells
which successfully express the introduced sequences in selective media. Resistant, stably
transfected cells can be proliferated using tissue culture techniques appropriate to the cell
type. An amplification step, e.g., by administration of methotrexate to cells transfected with
a DHFR gene according to methods well known in the art, can be included.

KITS FOR USE IN DIAGNOSTIC, RESEARCH, AND THERAPEUTIC APPLICATIONS
[0127] For use in diagnostic, research, and therapeutic applications disclosed here, kits are
also provided by the invention. In the diagnostic and research applications such kits may
include any or all of the following: assay reagents, buffers, 26#77, CPNE1, ITGB4BP,
RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20,
or C20orf188-specific nucleic acids or antibodies, hybridization probes and/or primers, and
the like. A therapeutic product may include sterile saline or another pharmaceutically
acceptable emulsion and suspension base.

[0128] In addition, the kits may include instructional materials containing directions (i.e.,
protocols) for the practice of the methods of this invention. While the instructional materials
typically comprise written or printed materials, they are not limited to such. Any medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to, electronic storage media (e.g.,
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such
media may include addresses to internet sites that provide such instructional materials.

[0129] The present invention also provides for kits for screening for modulators of 26#77,
CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: a 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7,
ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 polypeptide or polynucleotide,
reaction tubes, and instructions for testing the desired 26#77, CPNE1, ITGB4BP, RAE1,
BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or
C20orf188 function.

[0130] A wide variety of kits and components can be prepared according to the present
invention, depending upon the intended user of the kit and the particular needs of the user.
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
will be selected based on correlations with important parameters in disease which may be 
identified in historical or outcome data. 

THERAPEUTIC METHODS 

Administration of Inhibitors

[0131] As noted above, inhibitors of the invention can be used to treat cancer and other
diseases associated with pathological cellular proliferation. The compounds that inhibit
26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP,
C20orf129, C20orf52, C20orf20, or C20orf188 activity can be administered by a variety of
methods including, but not limited to, parenteral (e.g., intravenous, intramuscular,
intradermal, intraperitoneal, and subcutaneous routes), topical, oral, local, or transdermal
administration. These methods can be used for prophylactic and/or therapeutic treatment.
The pharmaceutical compositions can be administered in a variety of unit dosage forms
depending upon the method of administration. For example, unit dosage forms suitable for
oral administration include powder, tablets, pills, capsules and lozenges.

[0132] The compositions for administration will commonly comprise an inhibitor dissolved
in a pharmaceutically acceptable carrier, preferably an aqueous carrier. A variety of aqueous
carriers can be used, e.g., buffered saline and the like. These solutions are sterile and 
generally free of undesirable matter. These compositions may be sterilized by conventional, 
well known sterilization techniques. The compositions may contain pharmaceutically 
acceptable auxiliary substances as required to approximate physiological conditions such as 
pH adjusting and buffering agents, toxicity adjusting agents and the like, for example, sodium 
acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate and the like. 
The concentration of active agent in these formulations can vary widely, and will be selected 
primarily based on fluid volumes, viscosities, body weight and the like in accordance with the 
particular mode of administration selected and the patient's needs. 

[0133] Thus, a typical pharmaceutical composition for intravenous administration would be 
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per 
day may be used, particularly when the drug is administered to a secluded site and not into 
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher 
dosages are possible in topical administration. Actual methods for preparing parenterally 
administrable compositions will be known or apparent to those skilled in the art and are 
described in more detail in such publications as Remington's Pharmaceutical Science, 15th 
ed., Mack Publishing Company, Easton, Pennsylvania (1980). 

[0134] The compositions containing inhibitors can be administered for therapeutic or 
prophylactic treatments. In therapeutic applications, compositions are administered to a 
patient suffering from a disease (e.g., colon cancer) in an amount sufficient to cure or at least 
partially arrest the disease and its complications. An amount adequate to accomplish this is 
defined as a "therapeutically effective dose." Amounts effective for this use will depend 
upon the severity of the disease and the general state of the patient's health. Single or 
multiple administrations of the compositions may be administered depending on the dosage 
and frequency as required and tolerated by the patient. In any event, the composition should
provide a sufficient quantity of the agents of this invention to effectively treat the patient.

Polynucleotide inhibitors

[0135] The activity of 26#77, CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2,
DNCL2A, PSMA7, ADNP, C20orf129, C20orf52, C20orf20, or C20orf188 protein can also
be down-regulated, or entirely inhibited, by the use of antisense polynucleotides, e.g., a
nucleic acid complementary to, and which can preferably hybridize specifically to, a 26#77,
CPNE1, ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 encoding mRNA. Binding of the antisense
polynucleotide to the mRNA reduces the translation and/or stability of the mRNA.
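
As a minimal illustration of the complementarity described above, the following Python sketch derives the antisense (reverse-complement) sequence for a stretch of target mRNA. The function name `antisense_dna` and the choice of the first 21 coding nucleotides of SEQ ID NO:1 (positions 40-60) as the target are assumptions made only for illustration; the sketch says nothing about specificity, backbone chemistry, or delivery.

```python
# Complement map for oligonucleotides written in the DNA alphabet (illustrative only).
_COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def antisense_dna(target: str) -> str:
    """Return the reverse complement of a target written 5'->3' (u is treated as t)."""
    target = target.lower().replace("u", "t")
    return "".join(_COMPLEMENT[base] for base in reversed(target))


# Hypothetical target: first 21 coding nucleotides of SEQ ID NO:1 (positions 40-60).
target = "atgggttgcgacgggggaaca"
oligo = antisense_dna(target)
print(oligo)
```

Applying the function twice returns the original target, which is a quick sanity check on any candidate oligonucleotide.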

[0136] Antisense polynucleotides can comprise naturally-occurring nucleotides, or
synthetic species formed from naturally-occurring subunits or their close homologs.
Antisense polynucleotides may also have altered sugar moieties or inter-sugar linkages.
Exemplary among these are the phosphorothioate and other sulfur containing species which
are known for use in the art. Analogs are comprehended by this invention so long as they
function effectively to hybridize with the target mRNA. See, e.g., Isis
Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

[0137] RNA interference is another mechanism to suppress gene expression in a
sequence-specific manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002);
Sharp (1999) Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In
mammalian cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have
been shown to be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001)
Nature 411:494-498.
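
To make the 21 nt figure concrete, the sketch below enumerates every 21-nucleotide window of a target sequence as a candidate siRNA target site. The function name `candidate_sites`, the default window length, and the absence of any filtering (for example GC content or off-target screening) are simplifying assumptions; this is not a design procedure taken from the cited references.

```python
from typing import Iterator, Tuple


def candidate_sites(mrna: str, length: int = 21) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, window) for every window of the given length."""
    for start in range(len(mrna) - length + 1):
        yield start + 1, mrna[start:start + length]


# Hypothetical input: first 30 coding nucleotides of SEQ ID NO:1.
example = "atgggttgcgacgggggaacaatccccaag"
sites = list(candidate_sites(example))
for position, window in sites:
    print(position, window)
```

A sequence of length n yields n - 20 windows of 21 nt, so the number of candidates grows linearly with transcript length and some further selection step would be needed in practice.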

[0138] Ribozymes can also be used to target and inhibit transcription of 26#77, CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 nucleotide sequences. A ribozyme is an RNA molecule
that catalytically cleaves other RNA molecules. Different kinds of ribozymes have been
described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes, RNase
P, and axehead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. Pharmacol. 25: 289-317 for
a general review of the properties of different ribozymes).

[0139] The polynucleotide inhibitors can be introduced into a cancer cell by any of a
number of well known techniques. For example, the polynucleotide inhibitors can be
conjugated to a binding molecule, as described in WO 91/04753. Suitable binding molecules
include, but are not limited to, molecules that bind cell surface receptors on the surface of the
target cancer cell. Preferably, conjugation of the binding molecule does not substantially
interfere with the ability of the binding molecule to bind to its corresponding receptor, or
block entry of the inhibitory polynucleotide into the cell. Alternatively, a polynucleotide
inhibitor may be introduced into a cell containing the target nucleic acid sequence by
formation of a polynucleotide-lipid complex.

EXAMPLES 

The following examples are offered to illustrate, but not to limit, the claimed invention.

Example 1: Unknown gene 26#77 is amplified and overexpressed at the RNA and protein 
levels in primary human colorectal cancers. 

[0140] Chromosome 20q is amplified in approximately 60% of primary human colorectal
cancers. However, no definitive gene target has been identified for the amplicon in human
colorectal cancers.

[0141] Unknown gene 26#77 was originally identified by virtue of its RNA expression
profile in a breast cancer cell line. Recombinant 26#77 protein was expressed and used to
generate antibodies specific for the 26#77 protein.

[0142] Twelve breast and colorectal cancer cell lines were tested for 26#77 DNA 
amplification and for RNA and protein levels. The 26#77 gene was amplified in three of 
twelve breast and colorectal cancer cell lines tested by Southern blot analysis or FISH. 
Northern blot analysis demonstrated that 26#77 RNA levels were elevated in nine of the 
twelve breast and colorectal cancer cell lines tested. 

[0143] The 26#77 protein was predominantly localized in the nucleus. A colorectal cancer
cell line (CACO2 cell line) was fractionated into cytoplasmic and nuclear fractions. Western
blot analysis with the anti-26#77 polyclonal antibody demonstrated that the majority of 26#77
protein was found in the nuclear fraction. Immunocytochemical analysis of CACO2 cells
also showed that 26#77 was predominantly localized in the nucleus. Similar results were
obtained using a breast cancer cell line (BT474) that overexpressed 26#77.

[0144] 26#77 was also cloned into a tetracycline-inducible vector (from Invitrogen). The
tet-inducible 26#77 vector was then used to transfect NIH 3T3 cells. After induction of 26#77
expression, 26#77 was localized to the cell nucleus as demonstrated by western blot analysis
and immunocytochemistry.

[0145] One hundred and twenty-five primary colorectal cancers with the 20q amplicon were
tested for 26#77 gene copy levels. The 26#77 gene was amplified in 60% of the 125
colorectal cancers tested by Southern blot analysis or FISH. A subset of the 125 primary
colorectal cancers (40 samples total) was tested for 26#77 RNA and protein levels. Of the
26#77 amplified colorectal cancers in the subset (20 cancers total), all had elevated levels of
26#77 RNA compared to matched normal colorectal tissue as demonstrated by Northern blot
analysis. Exemplary results are shown in Figure 1. Western blot analysis and
immunohistochemistry demonstrated that 26#77 protein levels were also elevated in the
samples. The results indicate that the 26#77 gene is a target of the 20q amplicon and is an
important novel oncogene in human colorectal cancer.
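
The comparison of tumor samples with matched normal tissue described above can be sketched as a simple ratio calculation. Everything in this Python example is an assumption made for illustration: the function name `elevated_samples`, the two-fold threshold, and the intensity values, which are hypothetical and are not data from this application.

```python
from typing import Dict, Tuple


def elevated_samples(
    levels: Dict[str, Tuple[float, float]], threshold: float = 2.0
) -> Dict[str, float]:
    """Return the tumor/normal ratio for each sample whose ratio meets the threshold."""
    flagged = {}
    for sample, (tumor, normal) in levels.items():
        if normal <= 0:
            raise ValueError(f"normal level for {sample} must be positive")
        ratio = tumor / normal
        if ratio >= threshold:
            flagged[sample] = ratio
    return flagged


# Hypothetical band intensities (tumor, matched normal) in arbitrary units.
hypothetical_levels = {"T1": (8.0, 2.0), "T2": (3.0, 2.5), "T3": (10.0, 4.0)}
flagged = elevated_samples(hypothetical_levels)
print(flagged)
```

The threshold is a free parameter in this sketch; an actual cutoff would have to be chosen from the assay's own controls.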

Example 2: Other genes are amplified and overexpressed in primary human colorectal
cancers.

[0146] Thirteen additional genes that reside on the q-arm of chromosome 20 are amplified
in approximately 60% of human colorectal cancers and have concurrent upregulation of their
RNA. Most genes in colorectal cancer amplicons downregulate their RNA to maintain
normal levels. These thirteen genes do not, and are therefore upregulated at both the DNA
and RNA level and may contribute to the cancer phenotype; i.e., they may be targets of the
amplification. The thirteen genes encode Copine 1 (CPNE1), Integrin beta-4 binding protein
(ITGB4BP), RNA Export 1 (RAE1), Bone morphogenic protein 7 (BMP7), GTP-binding
protein, alpha-stimulatory (GNAS), eukaryotic translation initiation factor 2, subunit 2
(EIF2S2), Dynein light chain A2 (DNCL2A), Proteasome subunit alpha-type 7 (PSMA7),
Activity dependent neuroprotector (ADNP), C20ORF129, C20ORF52, C20ORF20, and
C20ORF188. Accession numbers for the genes are found in Figure 2.



It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview of
this application and scope of the appended claims. All publications, patents, and patent
applications cited herein are hereby incorporated by reference in their entirety for all
purposes.

WHAT IS CLAIMED IS:

1. A method for determining the presence or absence of a colorectal
cancer cell in a patient, the method comprising determining the level of a target nucleic acid
that encodes SEQ ID NO: 2 in a biological sample from the patient, thereby determining the
presence or absence of the colorectal cancer cell in the patient.

2. The method of claim 1, wherein the target nucleic acid comprises a
sequence at least 80% identical to SEQ ID NO: 1.

3. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

4. The method of claim 3, further comprising the step of amplifying
nucleic acids before the step of determining the level of the nucleic acid.

5. The method of claim 3, wherein the isolated nucleic acids are mRNA.

6. The method of claim 1, wherein the biological sample is colorectal
tissue and the step of determining the level of target nucleic acid is carried out using in situ
hybridization.

7. The method of claim 1, wherein the step of determining the level of
target nucleic acid is carried out using a labeled nucleic acid probe that selectively hybridizes
to SEQ ID NO: 1 under stringent hybridization conditions.

8. The method of claim 1, wherein the step of determining the level of
target nucleic acid is carried out using a nucleic acid probe immobilized to a solid support,
wherein the probe selectively hybridizes to SEQ ID NO: 1 under stringent hybridization
conditions.

9. The method of claim 1, wherein the step of determining the level of
target nucleic acid is carried out using Northern blot analysis.

10. The method of claim 1, wherein the step of determining the level of the
target nucleic acid is carried out by comparing the amount of the target nucleic acid in the
biological sample to the amount of the target nucleic acid in a reference sample.

11. The method of claim 10, wherein the reference sample is from normal
colorectal tissue.

12. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat colorectal cancer.

13. The method of claim 1, wherein the patient is suspected of having
colorectal cancer.

14. An isolated expression vector comprising a nucleic acid sequence that
encodes SEQ ID NO: 2.

15. The isolated expression vector of claim 14, wherein the nucleic acid
sequence is at least 80% identical to SEQ ID NO: 1.

16. A host cell comprising the expression vector of claim 14.

17. A method for determining the presence or absence of a colorectal
cancer cell in a patient, the method comprising determining the level of a target protein
comprising a sequence as shown in SEQ ID NO: 2 in a biological sample from the patient,
thereby determining the presence or absence of the colorectal cancer cell in the patient.

18. The method of claim 17, wherein the step of determining the level of
the target protein is carried out using an antibody.

19. The method of claim 18, wherein the antibody is a monoclonal
antibody.

20. The method of claim 18, wherein the antibody is a polyclonal
antibody.

21. The method of claim 18, wherein the antibody is labeled.

22. The method of claim 21, wherein the label is fluorescent.

23. The method of claim 17, wherein the step of determining the level of
the target protein is carried out by comparing the amount of the target protein in the
biological sample to the amount of the target protein in a reference sample.

24. The method of claim 23, wherein the reference sample is from normal
colorectal tissue.

25. The method of claim 17, wherein the patient is undergoing a
therapeutic regimen to treat colorectal cancer.

26. The method of claim 17, wherein the patient is suspected of having
colorectal cancer.

27. A method for treating a cancer that overexpresses a 26#77 gene
product comprising administering to a subject in need of such treatment a therapeutically
effective amount of an inhibitor of 26#77 gene product.

28. The method of claim 27, wherein the inhibitor of a 26#77 gene product
is selected from the group consisting of an antisense RNA molecule, and an inhibitory RNA
molecule.

29. A method for determining the presence or absence of a colorectal
cancer cell in a patient, the method comprising determining the level of a target nucleic acid
that encodes SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, or 28 in a biological
sample from the patient, thereby determining the presence or absence of the colorectal cancer
cell in the patient.

30. A method for determining the presence or absence of a colorectal
cancer cell in a patient, the method comprising determining the level of a target protein
comprising a sequence as shown in SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, or
28 in a biological sample from the patient, thereby determining the presence or absence of the
colorectal cancer cell in the patient.

31. A method for treating a cancer that overexpresses a Copine 1 (CPNE1)
protein, the Integrin B4 binding protein (ITGB4BP), RNA Export homolog (RAE1), bone
morphogenic protein 7 (BMP7), G protein, alpha stimulating activity polypeptide 1 (GNAS),
eukaryotic translation initiation factor 2, subunit 2 beta (EIF2S2), dynein light chain A2
(DNCL2A), proteasome subunit alpha-type 7 (PSMA7), activity dependent neuroprotector (ADNP),
C20orf129, C20orf52, C20orf20, or C20orf188 gene product comprising administering to a
subject in need of such treatment a therapeutically effective amount of an inhibitor of CPNE1,
ITGB4BP, RAE1, BMP7, GNAS, EIF2S2, DNCL2A, PSMA7, ADNP, C20orf129,
C20orf52, C20orf20, or C20orf188 gene product.

INFORMAL SEQUENCE LISTING

SEQ ID NO: 1

1 ctttgggggt ttgctgctgg ctctgactcc cgtcctgcga tgggttgcga cgggggaaca
61 atccccaaga ggcatgaact ggtgaagggg ccgaagaagg ttgagaaggt cgacaaagat
121 gctgaattag tggcccaatg gaactattgt actctaagtc aggaaatatt aagacgacca
181 atagttgcct gtgaacttgg cagactttat aacaaagatg ccgtcattga atttctcttg
241 gacaaatctg cagaaaaggc tcttgggaag gcagcatctc acattaaaag cattaagaat
301 gtgacagagc tgaagctttc tgataatcct gcctgggaag gggataaagg aaacactaaa
361 ggtgacaagc acgatgacct ccagcgggcg cgtttcatct gccccgttgt gggcctggag
421 atgaacggcc gacacaggtt ctgcttcctt cggtgctgcg gctgtgtgtt ttctgagcga
481 gccttgaaag agataaaagc ggaagtttgc cacacgtgtg gggctgcctt ccaggaggat
541 gatgtcatcg tgctcaatgg caccaaggag gatgtggacg tgctgaagac aaggatggag
601 gagagaaggc tgagagcgaa gctggaaaag aaaacaaaga aacccaaggc agcagagtct
661 gtttcaaaac cagatgtcag tgaagaagcc ccagggccat caaaagttaa gacagggaag
721 cctgaagagg ccagccttga ttctagagag aagaaaacca acttggctcc caaaagcaca
781 gcaatgaatg agagctcttc tggaaaagct gggaagcctc cgtgtggagc cacaaagagg
841 tccatcgctg acagtgaaga atcggaggcc tacaagtccc tctttaccac tcacagctcc
901 gccaagcgct ccaaggagga gtctgcccac tgggtcaccc acacgtccta ctgcttctga
961 agcccgcact gccaccgctc ctgccccaga aggttgttta gtttccacgt aggcaggtcg
1021 ctttgtgcct ctgagtgcgc tgctgtgtgt tctctctata gttctgcgtc ataaagctgt
1081 cctggccagc cttcaagctg gtgtggccac tcttgatgtg aggcgtgtcg gttccagggg
1141 ggacatggga ggggctgcac agtggcccga ggtcatgctt gcttccacct gcaggtgcat
1201 ttggtccttt ccatggccag gaagccctgt gggctgcact ttttatgctt gcagtagcaa
1261 gagactccag agtcctcacc ggtgcagagt tggcacatat taattaacta aaattctaat
1321 gatcttgcta ccagcaataa atcaagtagg ccaagtgaaa ctgggcttta aaaaggatgg
1381 atttcaaata cactgtgccc actagaagct tcgaagggcc tcgtccctct gctacagccc
1441 tgggaggagc caggatcctt gttggtctag ctaaatactg ttaggggagt gtgccccatc
1501 tcatcatttc gaagatagca gagtcatagt tgggcacccg gtgattgggt tcaaaaataa
1561 agctggtctg cctcttctca aaaaaaaaaa aaagaaaaaa aaaaa

SEQ ID NO: 2

MGCDGGTIPKRHELVKGPKKVEKVDKDAELVAQWNYCTLSQEIL
RRPIVACELGRLYNKDAVIEFLLDKSAEKALGKAA

KGNTKGDKHDDLQRARFICPVVGLEMNGRHRFCFLRCCGCVFSERALKEIKAEVCHTC
GAAFQEDDVIVLNGTKEDVDVLKTRMEE
GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNESS
AYKSLFTTHSSAKRSKEESAHWVTHTSYCF
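
The relationship between SEQ ID NO:1 and SEQ ID NO:2 can be checked by translating the open reading frame that begins at the first atg of SEQ ID NO:1 (position 40). The Python sketch below is a minimal illustration using the standard genetic code; the names `CODON_TABLE` and `translate` are assumptions introduced here, and only the first 42 coding nucleotides are used.

```python
from itertools import product

# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].lower()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 42 coding nucleotides of SEQ ID NO:1 (positions 40-81).
orf_start = "atgggttgcgacgggggaacaatccccaagaggcatgaactg"
peptide = translate(orf_start)
print(peptide)
```

The resulting peptide can be compared with the first residues listed for SEQ ID NO:2.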


SEQ ID NO: 3 

Copine 1 (CPNE1) BC001142

1 ggcgaaggct ttgtagagtt cagaaatgag gctgactata aggctgctct gtgtcgtcat
61 aaacagtaca tgggcaatcg ctttattcaa gttcatccaa ttactaagaa aggtatgcta
121 gaaaagatag atatgattcg aaaaagactg cagaacttca gctatgacca gagggaaatg
181 atactaaatc cagaggggga tgtcaactct gccaaagtct gtgcccacat aacaaatatt
241 ccattcagca ttacaaagat ggatgttctt cagttcctag aaggaatccc agtggatgaa
301 aatgctgtac atgttcttgt tgataacaat gggcaaggtc taggacaggc attggttcag
361 tttaaaaatg aagatgatgc acatggccca ctgcgtgacc ttggttcagc tgtccatttc
421 ctgtgaccat ctcattgaca aggacatcgg ctccaagtct gacccactct gcgtcctttt
481 acaggatgtg ggagggggca gctgggctga gcttggccgg actgaacggg tgcggaactg
541 ctcaagccct gagttctcca agactctaca gcttgagtac cgctttgaga cagtccagaa
601 gctacgcttt ggaatctatg acatagacaa caagacgcca gagctgaggg atgatgactt
661 cctagggggt gctgagtgtt ccctaggaca gattgtgtcc agccaggtac tgactctccc
721 cttgatgctg aagcctggaa aacctgctgg gcgggggacc atcacggtct cagctcagga
781 attaaaggac aatcgtgtag taaccatgga ggtagaggcc agaaacctag ataagaagga
841 cttcctggga aaatcagatc catttctgga gttcttccgc cagggtgatg ggaaatggca
901 cctggtgtac agatctgagg tcatcaagaa caacctgaac cctacatgga agcgtttctc
961 agtccccgtt cagcatttct gtggtgggaa ccccagcaca cccatccagg tgcaatgctc
1021 cgattatgac agtgacgggt cacatgatct catcggtacc ttccacacca gcttggccca
1081 gctgcaggca gtcccggctg agtttgaatg catccaccct gagaagcagc agaaaaagaa
1141 aagctacaag aactctggaa ctatccgtgt caagatttgt cgggtagaaa cagagtactc
1201 ctttctggac tatgtgatgg gaggctgtca gatcaacttc actgtgggcg tggacttcac
1261 tggctccaat ggagacccct cctcacctga ctccctacac tacctgagtc caacaggggt
1321 caatgagtac ctgatggcac tgtggagtgt gggcagcgtg gttcaggact atgactcaga
1381 caagctgttc cctgcatttg gatttggggc ccaggttccc cctgactggc aggtctcgca
1441 tgaatttgcc ttgaatttca accccagtaa cccctactgt gcaggcatcc agggcattgt
1501 ggatgcctac cgccaagccc tgccccaagt tcgcctctat ggccctacca actttgcacc
1561 catcatcaac catgtggcca ggtttgcagc ccaggctgca catcagggga ctgcctcgca
1621 atacttcatg ctgttgctgc tgactgatgg tgctgtgacg gatgtggaag ccacacgtga
1681 ggctgtggtg cgtgcctcga acctgcccat gtcagtgatc attgtgggtg tgggtggtgc
1741 tgactttgag gccatggagc agctggacgc tgatggtgga cccctgcata cacgttctgg
1801 gcaggctgct gcccgcgaca ttgtgcagtt tgtaccctac cgccggttcc agaatgcccc
1861 tcgggaggca ttggcacaga ccgtgctcgc agaagtgccc acacaactgg tctcatactt
1921 cagggcccag ggttgggccc cgctcaagcc acttccaccc tcagccaagg atcctgcaca
1981 ggccccccag gcctaggttc ccttggaggc tgtggcaagt cctcaatcct gtgtcccaga
2041 ggtccctctg ggccacaacc caacccttct cactctcctc agtgctagca ctttgtattt
2101 tttgatactt ttatacttgt ttctgctttt gctgctcttg atcccacctt tgctcctgac
2161 aaccctcatt caataaagac cagtgaagac caaaaaaaaa aaaaaaaaaa a



SEQ ID NO: 4 

Copine 1 (CPNE1) BC001142 

/translation="MAHCVTLVQLSISCDHLIDKDIGSKSDPLCVLLQDVGGGSWAEL
GRTERVRNCSSPEFSKTLQLEYRFETVQKLRFGIYDIDNKTPELRDDDFLGGAECSLG
QIVSSQVLTLPLMLKPGKPAGRGTITVSAQELKDNRVVTMEVEARNLDKKDFLGKSDP
FLEFFRQGDGKWHLVYRSEVIKNNLNPTWKRFSVPVQHFCGGNPSTPIQVQCSDYDSD
GSHDLIGTFHTSLAQLQAVPAEFECIHPEKQQKKKSYKNSGTIRVKICRVETEYSFLD
YVMGGCQINFTVGVDFTGSNGDPSSPDSLHYLSPTGVNEYLMALWSVGSVVQDYDSDK
LFPAFGFGAQVPPDWQVSHEFALNFNPSNPYCAGIQGIVDAYRQALPQVRLYGPTNFA
PIINHVARFAAQAAHQGTASQYFMLLLLTDGAVTDVEATREAVVRASNLPMSVIIVGV
GGADFEAMEQLDADGGPLHTRSGQAAARDIVQFVTYRRFQNAPREALAQTVLAEVPTQ
LVSYFRAQGWAPLKPLPPSAKDPAQAPQA"



SEQ ID NO: 5 
Integrin B4 bind. prot. (ITGB4BP) AF047433

1 aacggaaacc tttttaggga gtccaaggta cagtcgccgc gtgcggagct tgttactggt
61 tacttggcct catggcggtc cgagcttcgt tcgagaacaa ctgtgagatc ggctgctttg
121 ccaagctcac caacacctac tgtctggtag cgatcggagg ctcagagaac ttctacagtg
181 tgttcgaggg cgagctctcc gataccatcc ccgtggtgca cgcgtctatc gccggctgcc
241 gcatcatcgg gcgcatgtgt gtggggaaca ggcacggtct cctggtaccc aacaatacca
301 ccgaccagga gctgcaacac attcgcaaca gcctcccaga cacagtgcag attaggcggg
361 tggaggagcg gctctcagcc ttgggcaatg tcaccacctg caatgactac gtggccttgg
421 tccacccaga cttggacagg gagacagaag aaattctggc agatgtgctc aaggtggaag
481 tcttcagaca gacagtggcc gaccaggtgc tagtaggaag ctactgtgtc ttcagcaatc
541 agggagggct ggtgcatccc aagacttcaa ttgaagacca ggatgagctg tcctctcttc
601 ttcaagtccc ccttgtggcg gggactgtga accgaggcag tgaggtgatt gctgctggga
661 tggtggtgaa tgactggtgt gccttctgtg gcctggacac aaccagcaca gagctgtcag
721 tggtggagag tgtcttcaag ctgaatgaag cccagcctag caccattgcc accagcatgc
781 gggattccct cattgacagc ctcacctgag tcaccttcca agttgttcca tgggctcctg
841 gctctggact gtggccaacc ttctccacat tccgcccaat ctgtaccgga tgctggcagg
901 gaggtggcag agagctcact gggactgagg ggctgggcac ccaacccttt tccacctgtg



WO 2004/064601 PCT/US2004/001153



961 cttatcgcct ggatctatca ttactgcaaa aacctgctct gttgtgctgg ctggcaggcc
1021 ctgtggctgc tggctgaggg ttctgctgtc ctgtgccacc ccattaaagt gcagttccct 
1081 ccggaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 



SEQ ID NO: 6

Integrin B4 bind. prot. (ITGB4BP) AF047433

/translation="MAVRASFENNCEIGCFAKLTNTYCLVAIGGSENFYSVFEGELSD
TIPVVHASIAGCRIIGRMCVGNRHGLLVPNNTTDQELQHIRNSLPDTVQIRRVEERLS
ALGNVTTCNDYVALVHPDLDRETEEILAD^
VHPKTSIEDQDELSSLLQVPLVAGTVNRGSEVIAAGMVVNDWCAFCGLDTTSTELSVV
ESVFKLNEAQPSTIATSMRDSLIDSLT"

SEQ ID NO: 7 

RNA Export Homol. (RAE1) U84720

1 gcggtagtca gggcagtttc taccgcaggc ttaaggaggc ttcgggctcc tgggatttct
61 gtccgcgctc ctggccctcg tccttcgcgc cagagcaggt tcgcaaactc ctcagaccct
121 tctgctcccg gccgccgctt tccgccgggg cgagaccccc aggttcaaaa tgagcctgtt
181 tggaacaacc tcaggttttg gaaccagtgg gaccagcatg tttggcagtg caactacaga
241 caatcacaat cccatgaagg atattgaagt aacatcatct cctgatgata gcattggttg
301 tctgtctttt agcccaccaa ccttgccggg gaactttctt attgcaggat catgggctaa
361 tgatgttcgc tgctgggaag ttcaagacag tggacagacc attccaaaag cccagcagat
421 gcacactggg cctgtgcttg atgtctgctg gagtgacgat gggagcaaag tgtttacggc
481 atcgtgtgat aaaactgcca aaatgtggga cctcagcagt aaccaagcga tacagatcgc
541 acagcatgat gctcctgtta aaaccatcca ttggatcaaa gctccaaact acagctgtgt
601 gatgactggg agctgggata agactttaaa gttttgggat actcgatcgt caaatcctat
661 gatggttttg caactccctg aaaggtgtta ctgtgctgac gtgatatacc ccatggctgt
721 ggtggcaact gcagagaggg gcctgattgt ctatcagcta gagaatcaac cttctgaatt
781 caggaggata gaatctccac tgaaacatca gcatcggtgt gtggctattt ttaaagacaa
841 acagaacaag cctactggtt ttgccctggg aagtatcgag gggagagttg ctattcacta
901 tatcaacccc ccgaaccccg ccaaagataa cttcaccttt aaatgtcatc gatctaatgg
961 aaccaacact tcagctcctc aggacattta tgcggtaaat ggaatcgcgt tccatcctgt
1021 tcatggcacc cttgcaactg tgggatctga tggtagattc agcttctggg acaaagatgc
1081 cagaacaaaa ctaaaaactt cggaacagtt agatcagccc atctcagctt gctgtttcaa
1141 tcacaatgga aacatatttg catacgcttc cagctacgac tggtcaaagg gacatgaatt
1201 ttataatccc cagaaaaaaa attacatttt cctgcgtaat gcagccgaag agctaaagcc
1261 caggaataag aagtagtggc tggagactct ggctcagcca gagttgtttc tctccactct
1321 gcctcatctc tgtacgaatt tgggtcccag ccttgttggg ttgtcagcca tggacatgga
1381 tttcaacccc tggagaaaac gatgtcattg ttcagcagct gagagcccag gcgtccgcgg
1441 cgacttgccg tctctccatt ccactgcctg ttgcagagtt tttctgtaac taagggggtt
1501 gaggttattg tagacgttag attgcgggca ccgccaggga ttttgcagcg cttcagtgta
1561 cgtgttagag aatattggaa aagcgtctgt gagccccgtg ctgtattttg taataaagtc
1621 ttttgcagat tgaaaaaaaa aaaaaaaaaa



SEQ ED NO: 8 
45 RNAExportHomol. (RAE1) U84720 

/translation "MSLFGTTSGFGTSGTSMFGSATTDNHNPMKDIEVTSSPDDSIGC 
LSFSPPTLPGNFLIAGSWAITOVRCWEVQDSGQTIPKAQQMHTGPVIjDVCWSDDGSKVF 
TAS CDKTAKMWDLS SNQAI Q I AQHDAPVKTIHWI KAPNYS CVMTG S WD KTL KFWDTRS 
SNPMMVLQLPERCYC^VIYPMAWATAER 
50 AIFKDKQNKPTGFALGSIEGRVAIHYINPPNPAKDNFTFKCHRSNGTNTSAPQDIYAV 
NGI AFHPVHGTLATVGSDGRFS FWDKDARTKLKTSEQIjDQP I S ACCFNHNGNI FAYAS 
SYDWSKGHEFYNPQKKNYIFIiRNAAEELKPRNKK" 
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SEQIDNO:9 

Bone morph. prot.7 (BMP7) BC008584 

1 gggcgcagcg gggcccgtct gcagcaagtg accgacggcc gggacggccg cctgccccct 

61 ctgccacctg gggcggtgcg ggcccggagc ccggagcccg ggtagcgcgt agagccggcg 
5 121 cgatgcacgt gcgctcactg cgagctgcgg cgccgcacag cttcgtggcg ctctgggcac 

181 ccctgttcct gctgcgctcc gccctggccg acttcagcct ggacaacgag gtgcactcga 
241 gcttcatcca ccggcgcctc cgcagccagg agcggcggga gatgcagcgc gagatcctct 
301 ccattttggg cttgccccac cgcccgcgcc cgcacctcca gggcaagcac aactcggcac 
361 ccatgttcat gctggacctg tacaacgcca tggcggtgga ggagggcggc gggc.ccggcg 
10 421 gccagggctt ctcctacccc tacaaggccg tcttcagtac ccagggcccc cctctggcca 

481 gcctgcaaga tagccatttc ctcaccgacg ccgacatggt catgagcttc gtcaacctcg 
541 tggaacatga caaggaattc ttccacccac gctaccacca tcgagagttc cggtttgatc 
601 tttccaagat cccagaaggg gaagctgtca cggcagccga attccggatc tacaaggact 
661 acatccggga acgcttcgac aatgagacgt tccggatcag cgtttatcag gtgctccagg 
721 agcacttggg cagggaatcg gatctcttcc tgctcgacag ccgtaccctc tgggcctcgg

781 aggagggctg gctggtgttt gacatcacag ccaccagcaa ccactgggtg gtcaatccgc 
841 ggcacaacct gggcctgcag ctctcggtgg agacgctgga tgggcagagc atcaacccca 
901 agttggcggg cctgattggg cggcacgggc cccagaacaa gcagcccttc atggtggctt 
961 tcttcaaggc cacggaggtc cacttccgca gcatccggtc cacggggagc aaacagcgca 
1021 gccagaaccg ctccaagacg cccaagaacc aggaagccct gcggatggcc aacgtggcag

1081 agaacagcag cagcgaccag aggcaggcct gtaagaagca cgagctgtat gtcagcttcc 
1141 gagacctggg ctggcaggac tggatcatcg cgcctgaagg ctacgccgcc tactactgtg 

1201 agggggagtg tgccttccct ctgaactcct acatgaacgc caccaaccac gccatcgtgc
1261 agacgctggt ccacttcatc aacccggaaa cggtgcccaa gccctgctgt gcgcccacgc 

1321 agctcaatgc catctccgtc ctctacttcg atgacagctc caacgtcatc ctgaagaaat
1381 acagaaacat ggtggtccgg gcctgtggct gccactagct cctccgagaa ttcagaccct
1441 ttggggccaa gtttttctgg atcctccatt gctcgccttg gccaggaacc agcagaccaa 
1501 ctgccttttg tgagaccttc ccctccctat ccccaacttt aaaggtgtga gagtattagg 
1561 aaacatgagc agcatatggc ttttgatcag tttttcagtg gcagcatcca atgaacaaga 

1621 tcctacaagc tgtgcaggca aaacctagca ggaaaaaaaa acaacgcata aagaaaaatg

1681 gccgggccag gtcattggct gggaagtctc agccatgcac ggactcgttt ccagaggtaa 
1741 ttatgagcgc ctaccagcca ggccacccag ccgtgggagg aagggggcgt ggcaaggggt 
1801 gggcacattg gtgtctgtgc gaaaggaaaa ttgacccgga agttcctgta ataaatgtca 
1861 caataaaacg aatgaatg 

SEQ ID NO: 10

Bone morph. prot. 7 (BMP7) BC008584

/translation="MHVRSLRAAAPHSFVALWAPLFLLRSALADFSLDNEVHSSFIHR
RLRSQERREMQREILSILGLPHRPRPHLQGKHNSAPMFMLDLYNAMAVEEGGGPGGQG
FSYPYKAVFSTQGPPLASLQDSHFLTDADMVMSFVNLVEHDKEFFHPRYHHREFRFDL
SKIPEGEAVTAAEFRIYKDYIRERFDNETFRISVYQVLQEHLGRESDLFLLDSRTLWA
SEEGWLVFDITATSNHWVVNPRHNLGLQLSVETLDGQSINPKLAGLIGRHGPQNKQPF
MVAFFKATEVHFRSIRSTGSKQRSQNRSKTPKNQEALRMANVAENSSSDQRQACKKHE
LYVSFRDLGWQDWIIAPEGYAAYYCEGECAFPLNSYMNATNHAIVQTLVHFINPETVP
KPCCAPTQLNAISVLYFDDSSNVILKKYRNMVVRACGCH"



SEQ ID NO: 11

G-prot., α-stimul. (GNAS) BC002722

1 ccgccgccgc cgcagcccgg ccgcgccccg ccgccgccgc cgccgccatg ggctgcctcg 
61 ggaacagtaa gaccgaggac cagcgcaacg aggagaaggc gcagcgtgag gccaacaaaa

121 agatcgagaa gcagctgcag aaggacaagc aggtctaccg ggccacgcac cgcctgctgc 
181 tgctgggtgc tggagaatct ggtaaaagca ccattgtgaa gcagatgagg atcctgcatg 
241 ttaatgggtt taatggagag ggcggcgaag aggacccgca ggctgcaagg agcaacagcg 
301 atggtgagaa ggcaaccaaa gtgcaggaca tcaaaaacaa cctgaaagag gcgattgaaa 
361 ccattgtggc cgccatgagc aacctggtgc cccccgtgga gctggccaac cccgagaacc
421 agttcagagt ggactacatc ctgagtgtga tgaacgtgcc tgactttgac ttccctcccg 
481 aattctatga gcatgccaag gctctgtggg aggatgaagg agtgcgtgcc tgctacgaac 
541 gctccaacga gtaccagctg attgactgtg cccagtactt cctggacaag atcgacgtga 
601 tcaagcaggc tgactatgtg ccgagcgatc aggacctgct tcgctgccgt gtcctgactt

661 ctggaatctt tgagaccaag ttccaggtgg acaaagtcaa cttccacatg tttgacgtgg 
721 gtggccagcg cgatgaacgc cgcaagtgga tccagtgctt caacgatgtg actgccatca 
781 tcttcgtggt ggccagcagc agctacaaca tggtcatccg ggaggacaac cagaccaacc
841 gcctgcagga ggctctgaac ctcttcaaga gcatctggaa caacagatgg ctgcgcacca
901 tctctgtgat cctgttcctc aacaagcaag atctgctcgc tgagaaagtc cttgctggga

961 aatcgaagat tgaggactac tttccagaat ttgctcgcta cactactcct gaggatgcta 
1021 ctcccgagcc cggagaggac ccacgcgtga cccgggccaa gtacttcatt cgagatgagt 
1081 ttctgaggat cagcactgcc agtggagatg ggcgtcacta ctgctaccct catttcacct 
1141 gcgctgtgga cactgagaac atccgccgtg tgttcaacga ctgccgtgac atcattcagc 
1201 gcatgcacct tcgtcagtac gagctgctct aagaagggaa cccccaaatt taattaaagc
1261 cttaagcaca attaattaaa agtgaaacgt aattgtacaa gcagttaatc acccaccata
1321 gggcatgatt aacaaagcaa cctttccctt cccccgagtg attttgcgaa accccctttt
1381 cccttcagct tgcttagatg ttccaaattt agaaagctta aggcggccta cagaaaaagg
1441 aaaaaaggcc acaaaagttc cctctcactt tcagtaaaaa taaataaaac agcagcagca
1501 aacaaataaa atgaaataaa agaaacaaat gaaataaata ttgtgttgtg cagcattaaa
1561 aaaaatcaaa ataaaaatta aatgtgagca aag

SEQ ID NO: 12 

G-prot., α-stimul. (GNAS) BC002722

/translation="MGCLGNSKTEDQRNEEKAQREANKKIEKQLQKDKQVYRATHRLL
LLGAGESGKSTIVKQMRILHVNGFNGEGGEEDPQAARSNSDGEKATKVQDIKNNLKEA
IETIVAAMSNLVPPVELANPENQFRVDYILSVMNVPDFDFPPEFYEHAKALWEDEGVR
ACYERSNEYQLIDCAQYFLDKIDVIKQADYVPSDQDLLRCRVLTSGIFETKFQVDKVN
FHMFDVGGQRDERRKWIQCFNDVTAIIFVVASSSYNMVIREDNQTNRLQEALNLFKSI
WNNRWLRTISVILFLNKQDLLAEKVLAGKSKIEDYFPEFARYTTPEDATPEPGEDPRV
TRAKYFIRDEFLRISTASGDGRHYCYPHFTCAVDTENIRRVFNDCRDIIQRMHLRQYE
LL"

SEQ ID NO: 13 
Euk. Transl. init. 2, subunit 2 (EIF2S2) BC000934

1 ggggtgtcgt ttcctttcgc tgatgcaaga gcctagtgcg gtggtgggag aggtatcggc 

61 aggggcagcg ctgccgccgg ggcctggggc tgacccgtct gacttcccgt ccgtgccgag 
121 cccactcgag ccgcagccat gtctggggac gagatgattt ttgatcctac tatgagcaag 
181 aagaaaaaga agaagaagaa gccttttatg ttagatgagg aaggggatac ccaaacagag 
241 gaaacccagc cttcagaaac aaaagaagtg gagccagagc caactgagga caaggatttg

301 gaagctgatg aagaggacac taggaaaaaa gatgcttctg atgatctaga tgacttgaac 
361 ttctttaatc aaaagaaaaa gaagaaaaaa actaaaaaga tatttgatat tgatgaagct 
421 gaagaaggtg taaaggatct taagattgaa agtgatgttc aagaaccaac tgaaccagag 
481 gatgaccttg acattatgct tggcaataaa aagaagaaaa agaagaatgt taagttccca 
541 gatgaggatg aaatactaga gaaagatgaa gctctagaag atgaagacaa caaaaaagat

601 gatggtatct cattcagtaa tcagacaggc cctgcttggg caggctcaga aagagactac 
661 acatacgagg agctgctgaa tcgagtgttc aacatcatga gggaaaagaa tccagatatg 
721 gttgctgggg agaaaaggaa atttgtcatg aaacctccac aagtcgtccg agtaggaacc 
781 aagaaaactt cttttgtcaa ctttacagat atctgtaaac tattacatcg tcagcccaaa 
841 catctccttg catttttgtt ggctgaattg ggtacaagtg gttctataga tggtaataac

901 caacttgtaa tcaaaggaag attccaacag aaacagatag aaaatgtctt gagaagatat 
961 atcaaggaat atgtcacttg tcacacatgc cgatcaccgg acacaatcct gcagaaggac 
1021 acacgactct atttcctaca gtgcgaaact tgtcattcta gatgttctgt tgccagtatc 
1081 aaaaccggct tccaggctgt cacgggcaag cgagcacagc tccgtgccaa agctaactaa 
1141 tttgctaatc actgattttg caaagcttgt tgtggagatg tggctggaca ggtttgccat

1201 cagagtggat ataccgttgt attaaaaaca agataaaaaa gctgccaaga tttttggcga 
1261 gtggttggtc tgaagtcctt gcaagacgct gatgctcaag ctgttgacat actcattgcc
1321 tactttaaca cctgtcagag aaacgtgata tggggtaagg aggtgctttt ttaaaatcgt 
1381 tcatagactt ctgtaaaatg caagataaat taaagttatt ataacagtga ttctttcaa 

SEQ ID NO: 14

Euk. Transl. init. 2, subunit 2 (EIF2S2) BC000934

/translation="MSGDEMIFDPTMSKKKKKKKKPFMLDEEGDTQTEETQPSETKEV
EPEPTEDKDLEADEEDTRKKDASDDLDDLNFFNQKKKKKKTKKIFDIDEAEEGVKDLK
IESDVQEPTEPEDDLDIMLGNKKKKKKNVKFPDEDEILEKDEALEDEDNKKDDGISFS
NQTGPAWAGSERDYTYEELLNRVFNIMREKNPDMVAGEKRKFVMKPPQVVRVGTKKTS
FVNFTDICKLLHRQPKHLLAFLLAELGTSGSIDGNNQLVIKGRFQQKQIENVLRRYIK
EYVTCHTCRSPDTILQKDTRLYFLQCETCHSRCSVASIKTGFQAVTGKRAQLRAKAN"

SEQ ID NO: 15 
Dynein lt. Chain A2 (DNCL2A) NM014183

1 cgcagaaagg cacaggactc gctaagtgtt cgctacgcgg ggctaccgga tcggtcggaa 

61 atggcagagg tggaggagac actgaagcga ctgcagagcc agaagggagt gcagggaatc 
121 atcgtcgtga acacagaagg cattcccatc aagagcacca tggacaaccc caccaccacc 
181 cagtatgcca gcctcatgca cagcttcatc ctgaaggcac ggagcaccgt gcgtgacatc 
241 gacccccaga acgatctcac cttccttcga attcgctcca agaaaaatga aattatggtt 
301 gcaccagata aagactattt cctgattgtg attcagaatc caaccgaata agccactctc 
361 ttggctccct gtgtcattcc ttaatttaat gccccccaag aatgttaatg tcaatcatgt 
421 cagtggacta gcacatggca gtcgcttgga acccactcac accaatccag tgaccgtgtg 
481 tgggctggcg gctcttctcc cccaccaacg gaacccctgt gtgcaccaac cttccccaga 
541 gctccggagc gccctctcct cacttccagg ttttggagca agagcttgca ggaagcccgc 
601 acccagcttc cttctgacct tcagttcact ttgtcgccct tggagaaagc tgtttttctt 
661 taactaaaaa taaccaaaat gcttaaaaaa aaaaaaaaaa aa 

SEQ ID NO: 16 
Dynein lt. Chain A2 (DNCL2A) NM014183

/translation="MAEVEETLKRLQSQKGVQGIIVVNTEGIPIKSTMDNPTTTQYAS
LMHSFILKARSTVRDIDPQNDLTFLRIRSKKNEIMVAPDKDYFLIVIQNPTE"



SEQ ID NO: 17 

Proteasome subunit α-7 (PSMA7) BC004427

1 cggcgccgag ggtggggcgc gggcgtagtg gcgccgggag tcgcgggtgc gcgcgggccg 

61 tgagtgtgcg cttttgagag tcgcggcgga aggagcccgg ccgccgcccg ccggcatgag 
121 ctacgaccgc gccatcaccg tcttctcgcc cgacggccac ctcttccaag tggagtacgc 
181 gcaggaggcc gtcaagaagg gctcgaccgc ggttggtgtt cgaggaagag acattgttgt 
241 tcttggtgtg gagaagaagt cagtggccaa actgcaggat gaaagaacag tgcggaagat 
301 ctgtgctttg gatgacaacg tctgcatggc ctttgcaggc ctcaccgccg atgcaaggat 
361 agtcatcaac agggcccggg tggagtgcca gagccaccgg ctgactgtgg aggacccggt 
421 cactgtggag tacatcaccc gctacatcgc cagtctgaag cagcgttata cgcagagcaa 
481 tgggcgcagg ccgtttggca tctctgccct catcgtgggt ttcgactttg atggcactcc 
541 taggctctat cagactgacc cctcgggcac ataccatgcc tggaaggcca atgccatagg 
601 tcggggtgcc aagtcagtgc gcgagttcct ggagaagaac tatactgacg aagccattga 
661 aacagatgat ctgaccatta agctggtgat caaggcactc ctggaagtgg ttcagtcagg 
721 tggcaaaaac attgaacttg ctgtcatgag gcgagatcaa tccctcaaga ttttaaatcc 
781 tgaagaaatt gagaagtatg ttgctgaaat tgaaaaagaa aaagaagaaa acgaaaagaa 
841 gaaacaaaag aaagcatcat gatgaataaa atgtctttgc ttgtaatttt taaattcata 
901 tcaatcatgg atgagtctcg atgtgtaggc ctttccattc catttattca cactgagtgt 
961 cctacaataa acttccgtat tttt



SEQ ID NO: 18

Proteasome subunit α-7 (PSMA7) BC004427

/translation="MSYDRAITVFSPDGHLFQVEYAQEAVKKGSTAVGVRGRDIVVLG
VEKKSVAKLQDERTVRKICALDDNVCMAFAGLTADARIVINRARVECQSHRLTVEDPV
TVEYITRYIASLKQRYTQSNGRRPFGISALIVGFDFDGTPRLYQTDPSGTYHAWKANA
IGRGAKSVREFLEKNYTDEAIETDDLTIKLVIKALLEVVQSGGKNIELAVMRRDQSLK
ILNPEEIEKYVAEIEKEKEENEKKKQKKAS"



SEQ ID NO: 19

Activity dep. Neuroprotector (ADNP) AF250860 

1 cggccgcggc gcgagccgga gtccgccgag ccggagcgcg acgaggcccc gggcgcgccc 

61 tccccgctgc cgccaccgcc gtgccgccgc catccgcccg ccgccgccgc cgctgtccgg 
121 cccccgagca cgccggcccc gcgcgcgcct cgaggccgag tcaaggtgtg agatgcacaa 
181 tgcgaaacct aggccccagc ttttacacca tgatgcgcag ggttgtactt tttgtactga 
241 actgataggt ggcctagtgg ttatgccctg tactaccatt ttgaggatct ggactccgtt 
301 tcctgccttg ctctttggac cacattgtca attcacaccg aaactatgtt ccaacttcct
361 gtcaacaatc ttggcagttt aagaaaagcc cggaaaactg tgaaaaaaat acttagtgac
421 attgggttgg aatactgtaa agaacatata gaagatttta aacaatttga acctaatgac 

481 ttttatttga aaaacactac atgggaggat gtaggactgt gggacccatc acttacgaaa
541 aaccaggact atcggacaaa acctttctgc tgcagcgctt gtccattttc ctcaaaattc 
601 ttctctgcct acaaaagtca tttccgcaat gtccatagtg aagactttga aaataggatt 
661 ctccttaatt gcccctactg taccttcaat gcagacaaaa agactttgga aacacacatt
721 aaaatatttc atgctccgaa cgccagcgca ccaagtagca gcctcagcac tttcaaagat 
781 aaaaacaaaa atgatggcct taaacctaag caggctgaca gtgtagagca agctgtttat 
841 tactgtaaga agtgcactta ccgagatcct ctttatgaaa tagttaggaa gcacatttac 
901 agggaacatt ttcagcatgt ggcagcacct tacatagcaa aggcaggaga aaaatcactc 
961 aatggggcag tccccttagg ctcgaatgcc cgagaagaga gtagtattca ctgcaagcga 

1021 tgccttttca tgccaaagtc ctatgaagct ttggtacagc atgtcatcga agaccatgaa 
1081 cgtataggct atcaggtcac tgccatgatt gggcacacaa atgtagtggt tccccgatcc 
1141 aaacccttga tgctaattgc tcccaaacct caagacaaga agagcatggg actcccacca 

1201 aggatcggtt cccttgcttc tggaaatgtc cggtctttac catcacagca gatggtgaat
1261 cgactctcaa taccaaagcc taacttaaat tctacaggag tcaacatgat gtccagtgtt 
1321 catctgcagc agaacaacta tggagtcaaa tctgtaggcc agggttacag tgttggtcag 

1381 tcaatgagac tgggtctagg tggcaacgca ccagtttcca ttcctcaaca atctcagtct
1441 gtaaagcagt tacttccaag tggaaacgga aggtcttatg ggcttgggtc agagcagagg 
1501 tcccaggcac cagcaagata ctccctgcag tctgctaatg cctcttctct ctcatcgggc 
1561 cagttaaagt ctccttccct ctctcagtca caggcatcca gagtgttagg tcagtccagt 
1621 tccaaacctg ctgcagctgc cacaggccct cccccaggta acacttcctc aactcaaaag 
1681 tggaaaatat gtacaatctg taatgagctt tttcctgaaa atgtctatag tgtgcacttc 
1741 gaaaaagaac ataaagctga gaaagtccca gcagtagcca actacattat gaaaatacac 
1801 aattttacta gcaaatgcct ctactgtaat cgctatttac ccacagatac tctgctcaac 
1861 catatgttaa ttcatggtct gtcttgtcca tattgccgtt caactttcaa tgatgtggaa 
1921 aagatggccg cacacatgcg gatggttcac attgatgaag agatgggacc taaaacagat 
1981 tctactttga gttttgattt gacattgcag cagggtagtc acactaacat ccatctcctg 
2041 gtaactacat acaatctgag ggatgcccca gctgaatctg ttgcttacca tgcccaaaat
2101 aatcctccag ttcctccaaa gccacagcca aaggttcagg aaaaggcaga tatccctgta 
2161 aaaagttcac ctcaagctgc agtgccctat aaaaaagatg ttgggaaaac cctttgtcct 
2221 ctttgctttt caatcctaaa aggacccata tctgatgcac ttgcacatca cttacgagag 
2281 aggcaccaag ttattcagac ggttcatcca gttgagaaaa agctcaccta caaatgtatc
2341 cattgccttg gtgtgtatac cagcaacatg accgcctcaa ctatcactct gcatctagtt 
2401 cactgcaggg gcgttggaaa gacccaaaat ggccaggata agacaaatgc accctctcgg 
2461 cttaatcagt ctccaagtct ggcacctgtg aagcgcactt acgagcaaat ggaatttccc 
2521 ttactgaaaa aacgaaagtt agatgatgat agtgattcac ccagcttctt tgaagagaag 
2581 cctgaagagc ctgttgtttt agctttagac cccaagggtc atgaagatga ttcctatgaa 
2641 gccaggaaaa gctttctaac aaagtatttc aacaaacagc cctatcccac caggagagaa
2701 attgagaagc tagcagccag tttatggtta tggaagagtg acatcgcttc ccattttagt 
2761 aacaaaagga agaagtgtgt ccgtgattgt gaaaagtaca agcctggcgt gttgctgggg 

2821 tttaacatga aagaattaaa taaagtcaag catgagatgg attttgatgc tgagtggcta
2881 tttgaaaatc atgatgagaa ggattccaga gtcaatgcta gtaagactgc tgacaaaaag
2941 ctcaaccttg ggaaggaaga tgacagttcc tcagacagtt ttgaaaattt ggaagaagaa 
3001 tccaatgaaa gtggtagccc ttttgaccct gtttttgaag ttgaacctaa aatctctaac 
3061 gataacccag aggaacatgt actgaaggta attcctgagg atgcttcaga atctgaggag 
3121 aagctagacc aaaaagagga tggttcaaaa tacgaaacta ttcatttgac tgaggaacca 
3181 accaaactaa tgcacaatgc atctgatagt gaggttgacc aagacgatgt tgttgagtgg 
3241 aaagacggtg cttctccatc tgagagtggg cctggatccc aacaagtgtc agactttgag 
3301 gacaatacct gcgaaatgaa accaggaacc tggtctgacg agtcttccca aagcgaagat
3361 gcaaggagca gtaagccagc tgccaaaaaa aaggctacca tgcaaggtga cagagagcag
3421 ttgaaatgga agaatagttc ctatggaaaa gttgaagggt tttggtctaa ggaccagtca 

3481 cagtggaaga atgcatctga gaatgatgag cgcttatcta acccccagat tgagtggcag

3541 aatagcacaa ttgacagtga ggatggggaa cagtttgaca acatgactga tggagtagct 
3601 gagcccatgc atggcagctt agccggagtt aaactgagca gccaacaggc ctaagtgcca 

3661 ggttccctgg cgttggtgac atgctgcagc ctggaactct gatctccagt gtgactgcaa
3721 agctgtcttc tcactggtac tgccttgtga gtactggttg gactgtgggg catgtggccg 
3781 ctgcagttcc agtggttatt tctaagtcta tgacaggaca ggctgttctt gcttcagaac 
3841 cttctctgac agacacggta actaaatgtg aaaaaccaat aagctggtga ctcatgaata 
3901 cacacgagga aaagcagagg tttattttat ctgccttttc aacatttctt tccctctgtg
3961 aaatgattgg tcagatgtct ttgagaagtg ttaaactaat tcacatggta gtgtagggcc
4021 aacatacaag ctaccagtct aatgtgtata gtagactttg ggaaaagcga ttttttttca 
4081 tgtattcatt ctgaatagtt gaaatgtata tttgtacagt cttttagacc tattcaagtg 
4141 atgctcatga tcctgttact gtgtgcccat catagatttc tttttttagt gttgcccttg 
4201 ctgtgtaata aacgctctat ctagtttacc tagcaaaagc tcaaaactgc gctagtatgg
4261 actttttgga cagacttagt ttttgcacat aaccttgtac aatcttgcaa cagaggccag 
4321 ccacgtaaga tatatatctg gactctcttg tattatagga tttttcttgt tctgaatatc 
4381 cttgacatta cagctgtcaa aaacaaaaac tggtatttca gatctgtttt ctgaaatctt 
4441 ttaagctaaa atcacatgca agaattgact ttgcagctac taattttgac accttttaga 
4501 tctgtataaa agtgtgttgt gttgaagcag caaaccaatg agtgctgcat tttggatatt 
4561 tagttttatc tttagttcaa caccatcatg gtggattcat ttataccatc taatatatga 
4621 cacactgttg tagtatgtat aattttgtga tctttatttt ccctttgtat tcattttaag 
4681 catctaaata aattgctgta ttgtgcttaa tgt 

SEQ ID NO: 20 

Activity dep. Neuroprotector (ADNP) AF250860 

/translation="MFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIEDFKQFEPN
DFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACPFSSKFFSAYKSHFRNVHSEDFE
NRILLNCPYCTFNADKKTLETHIKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSV
EQAVYYCKKCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSLNGAVPLGSNAREE
SSIHCKRCLFMPKSYEALVQHVIEDHERIGYQVTAMIGHTNVVVPRSKPLMLIAPKPQ
DKKSMGLPPRIGSLASGNVRSLPSQQMVNRLSIPKPNLNSTGVNMMSSVHLQQNNYGV
KSVGQGYSVGQSMRLGLGGNAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQAPARY
SLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAAATGPPPGNTSSTQKWKICT
ICNELFPENVYSVHFEKEHKAEKVPAVANYIMKIHNFTSKCLYCNRYLPTDTLLNHML
IHGLSCPYCRSTFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIHLLV
TTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKSSPQAAVPYKKDVGKTLC
PLCFSILKGPISDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNMTASTITL
HLVHCRGVGKTQNGQDKTNAPSRLNQSPSLAPVKRTYEQMEFPLLKKRKLDDDSDSPS
FFEEKPEEPVVLALDPKGHEDDSYEARKSFLTKYFNKQPYPTRREIEKLAASLWLWKS
DIASHFSNKRKKCVRDCEKYKPGVLLGFNMKELNKVKHEMDFDAEWLFENHDEKDSRV
NASKTADKKLNLGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEHVLK
VIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDSEVDQDDVVEWKDGASPS
ESGPGSQQVSDFEDNTCEMKPGTWSDESSQSEDARSSKPAAKKKATMQGDREQLKWKN
SSYGKVEGFWSKDQSQWKNASENDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEPM
HGSLAGVKLSSQQA"
SEQ ID NO: 21
C20orf129 AK055793

1 aaaaagcagc caatgggaga gccgaggcgg ggaggtgcgg ccaatggcgc gggcctgttt

61 gattcaaagg ttgcctataa agcgggactg cacgccggtt tttgtccgag ggctgtcgag

121 tccgagcgcc gccatggctc tgctgtccga gggcctggac gaggtgcccg ccgcctgcct 

181 gtcgccgtgc gggccgccca acccgaccga gctgttcagc gagtcacggc gcctggctct 

241 ggaggagctg gtggcgggcg gccccgaagc cttcgcggcc ttcctgcgac gcgagcgcct 

301 ggctcgtttc ctgaaccccg atgaggtgca cgccattctg cgcgcggcgg agaggccggg

361 agaggagggc gcggcggcgg cggcggcggc cgaggactcg ttcggctcct cgcacgactg 

421 ctcttcgggc acctacttcc ccgagcagtc ggacctggag ccaccgctgt tggagcttgg 

481 ctggcccgcc ttctaccagg gcgcctaccg cggcgccacg cgtgtcgaga cgcacttcca 

541 gccccgcggc gctggcgaag gtggccccta cggctgcaag gacgctctgc gccagcagct 

601 ccgctcggcg cgagaggtga ttgcagtggt catggacgtg ttcacagaca tcgacatctt

661 cagagacctg caagaaatat gcaggaaaca gggagttgct gtgtatatcc ttctggacca 

721 ggctctcctc tctcaatttc tggatatgtg catggatctg aaagttcatc ctgaacagga 

781 aaagttaatg acagttcgga ctatcacagg aaatatctac tatgcaaggt caggaactaa 

841 gattattggg aaggttcacg aaaagttcac gttgattgat ggcatccgcg tggcaacagg 

901 ctcctacagt tttacataga cggatggcaa attaaacagc agtaacttgg taattctgtc

961 tggccaagtg gttgaacact ttgatctgga gttccgaatc ctgtatgccc agtccaagcc 

1021 catcagcccc aaactcctgt ctcacttcca gagcagcaac aagtttgatc acctcaccaa 

1081 ccgaaaacca cagtccaagg agctcaccct gggcaacctg ctgcggatgc ggctggctag 

1141 gctgtcaagt actcccagga aggcggacct ggacccagag atgcccgcag agggcaaggc 

1201 agagcgcaag ccccatgact gtgagtcctc tactgttagt gaggaagact acttcagcag

1261 ccacagggac gagctccaga gcagaaaggc cattgacgct gccactcaaa cagagccagg 

1321 agaggagatg ccagggctga gtgtgagtga ggtgggaaca caaaccagca tcaccacagc 

1381 atgtgctggt acccagactg cagtcatcac caggatagca agctctcaaa ccacgatttg 

1441 gtccagatcg accactactc agactgacat ggatgagaac attctctttc ctcgaggaac 

1501 tcaatctaca gaagggtcac cagtctcaaa aatgtctgta tcgagatctt ccagtttgaa

1561 gtcttcctcc tctgtgtctt cccaaggctc tgtggcaagc tccactggtt ctcccgcttc 

1621 catcagaacc actgacttcc acaatcctgg ctatcccaag tacctgggca ccccccacct 

1681 ggaactgtac ttgagtgact cacttagaaa cttgaacaaa gagcggcaat tccacttcgc 

1741 tggtatcagg tcccggctca accacatgct ggctatgctg tcaaggagaa cactctttac 

1801 tgaaaaccac cttggccttc attctggcaa tttcagcaga gttaatttgc ttgctgttag

1861 agatgtagca ctttatcctt cctatcagta actgctccgt gttcagactc ctggtttctt 

1921 ccaggcttac agtggacatc atcagcttcc tgctttaaaa aatatcttat gtccctaatt 

1981 gcctttcttt tacctgactt tgtcaccttt gttgtctttg aattctttag gctgcatatt 

2041 attttacatg ctttgttttg tcatgtatat accaggtatt ggttttatgg tttaaacact 

2101 atggatacag gggtttgttt tgcacaattt taatagtcat gcactacata atgatgtttt

2161 ggtcaatgac agaccacgta tatgttggca gtctcataag attataatac tgtattttta 

2221 ctataccttt tctgtgttta gatacaaata ccattatgtt acagttgcct acagtattca 

2281 gtgcagtaac atgatgtaca ggtttgtagc ctgttttgca tttttcttag gttgtatgct 

2341 cttctgtttt aaaggtttga atcaccagca tttttgtgat caaaatccta tttagaaaaa 

2401 ataaaactac tttctgttta tctcttt

SEQ ID NO: 22 
C20orf129 AK055793

/translation="MARACLIQRLPIKRDCTPVFVRGLSSPSAAMALLSEGLDEVPAA
CLSPCGPPNPTELFSESRRLALEELVAGGPEAFAAFLRRERLARFLNPDEVHAILRAA
ERPGEEGAAAAAAAEDSFGSSHDCSSGTYFPEQSDLEPPLLELGWPAFYQGAYRGATR
VETHFQPRGAGEGGPYGCKDALRQQLRSAREVIAVVMDVFTDIDIFRDLQEICRKQGV
AVYILLDQALLSQFLDMCMDLKVHPEQEKLMTVRTITGNIYYARSGTKIIGKVHEKFT
LIDGIRVATGSYSFT"
SEQ ID NO: 23
C20orf52 BC008488 

1 gacgcggggc cggaacgcga agagggtggt ggagtcgggc tacccactga ttttccttcc 

61 cttacttccc ctgagccctt gggcccactt cccagcctac cgcttccgtc cccgcccgac 

121 tcttgggcca gcgcctgggc ccacactttc ctatcccccg cagatgccgg tggccgtggg

181 tccctacgga cagtcccagc caagctgctt cgaccgtgtc aaaatgggct tcgtgatggg 

241 ttgcgccgtg ggcatggcgg ccggggcgct cttcggcacc ttttcctgtc tcaggatcgg 

301 aatgcggggt cgagagctga tgggcggcat tgggaaaacc atgatgcaga gtggcggcac
361 ctttggcaca ttcatggcca ttgggatggg catccgatgc taaccatggt tgccaactac 

421 atctgtccct tcccatcaat cccagcccat gtactaataa aagaaagtct ttgagtaaaa
481 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
541 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
601 aa 

SEQ ID NO: 24

C20orf52 BC008488 

/translation="MPVAVGPYGQSQPSCFDRVKMGFVMGCAVGMAAGALFGTFSCLR
IGMRGRELMGGIGKTMMQSGGTFGTFMAIGMGIRC"

SEQ ID NO: 25

C20orf20 BC018841 

1 ctcccgccgg gggctccttg ctcggccggg ccgcggccat gggagaggcc gaggtgggcg 

61 gcgggggcgc cgcaggcgac aagggcccgg gggaggcggc caccagcccg gcggaggaga 

121 cagtggtgtg gagccccgag gtggaggtgt gcctcttcca cgccatgctg ggccacaagc 

181 ccgtcggtgt gaaccgacac ttccacatga tttgtattcg ggacaagttc agccagaaca

241 tcgggcggca ggtcccatcc aaggtcatct gggaccatct gagcaccatg tacgacatgc 

301 aggcgctgca tgagtctgag attcttccat tcccgaatcc agagaggaac ttcgtccttc 

361 cagaagagat cattcaggag gtccgagaag gaaaagtgat gatagaagag gagatgaaag

421 aggagatgaa ggaagacgtg gacccccaca atggggctga cgatgttttt tcatcttcag 

481 ggagtttggg gaaagcatca gaaaaatcca gcaaagacaa agagaagaac tcctcagact

541 tggggtgcaa agaaggcgca gacaagcgga agcgcagccg ggtcaccgac aaagtcctga 

601 ccgcaaacag caacccttcc agtcccagtg ctgccaagcg gcgccgcacg tagaccctca 

661 gccctggtgg cggcagagaa gcgggcgagg cactgtggtc gctgaggggg ttggctgggt 

721 ctgagtgcca cccccaggcc acagtgatac catcccagtg ccatgagccc acactgcccg 

781 ccctcaggct ctcaggtgaa cgtggccgtc agcggggaaa cgtgtgtgtc agttggacca

841 tgtgggaccc tgatggacct gaaagaccag gatcggtcca gctcagatat tgagggctct 

901 gaagcctagt tctgtcttct ctggagcagc tgtggcttcc ccgtggctgc ttggtgacat 

961 ggattagcgc tacgtgggct gcagcatttg ggatccaggc tacctagagg ggcatcgggc 

1021 cagggaaaac ctcggattag caagcaataa aaacatgacc tcactcttcc tcaaaggagc 

1081 ccctggtctt ccctgtgtga ctcagttctt tccatctgtt tgtcccgctg caagcctctt

1141 tctgcgctga ctgtgacatt ggaacgtggc cttcctgtca ccccctccgt gccacgcact 

1201 gaaggccacc cccacccacc tgggaaacta agaactggat attttgcctc attcacttgt 

1261 actgtaacaa tgtatataat ttggttggta tttcactatt taatttttaa gaagcctatt 

1321 ttactagtgt tttatatgaa caaagtactg cagaagttaa acctgtgttg tattttttct 

1381 gagatgtttt gctttaagag atactttttg ctcagttttt atatgccaga tacagagaat

1441 ttgtagcggt tatttttgta tgatctagta acttgcaaac agaccaaatg gatgagaggc 

1501 ggggaccgtg cagctgtcgg ctgatgagga ggcggccgcc ccagtgctga tggagatgcc 

1561 actttcgtgt gactgcgaac attaaagcac aaaaaaaatc caaaaaaaaa aaaaaaaaaa 

1621 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
1681 aaaaa

SEQ ID NO: 26
C20orf20 BC018841

/translation="MGEAEVGGGGAAGDKGPGEAATSPAEETVVWSPEVEVCLFHAML
GHKPVGVNRHFHMICIRDKFSQNIGRQVPSKVIWDHLSTMYDMQALHESEILPFPNPE
RNFVLPEEIIQEVREGKVMIEEEMKEEMKEDVDPHNGADDVFSSSGSLGKASEKSSKD
KEKNSSDLGCKEGADKRKRSRVTDKVLTANSNPSSPSAAKRRRT"



SEQ ID NO: 27

C20orf188 BC013144

1 gacatggcgg cggcgccggt agcggctggg tctggagccg gccgagggag acggtcggca 

61 gccacagtgg cggcttgggg cggatggggc ggccggccgc ggcctggtaa cattctgctg 
121 cagctgcggc agggccagct gaccggccgg ggcctggtcc gggcggtgca gttcactgag 
181 acttttttga cggagaggga caaacaatcc aagtggagtg gaattcctca gctgctcctc 
241 aagctgcaca ccaccagcca cctccacagt gactttgttg agtgtcaaaa catcctcaag 
301 gaaatttctc ctcttctctc catggaggct atggcatttg ttactgaaga gaggaaactt 

361 acccaagaaa ccacttatcc aaatacttat atttttgact tgtttggagg tgttgatctt
421 cttgtagaaa ttcttatgag gcctacgatc tctatccggg gacagaaact gaaaataagt 

481 gatgaaatgt ccaaggactg cttgagtatc ctgtataata cctgtgtctg tacagaggga
541 gttacaaagc gtttggcaga aaagaatgac tttgtgatct tcctgtttac attgatgaca 
601 agtaagaaga cattcttaca aacagcaacc ctcattgaag atattttggg tgttaaaaag 

661 gaaatgatcc gactagatga agtccccaat ctgagttcct tagtatccaa tttcgatcag
721 cagcagctcg ctaatttctg ccggattctg gctgtcacca tttcagagat ggatacaggg
781 aatgatgaca agcacacgct tcttgccaaa aatgctcaac agaagaagag cttgagtttg
841 gggccttctg cagctgaaat caatcaagcg gcccttctca gcattcctgg ctttgttgag
901 cggctttgca aactggcgac tcgaaaggtg tcagagtcaa cgggcacagc cagcttcctt 
961 caggagttgg aagagtggta cacatggcta gacaatgctt tggtgctaga tgccctgatg 
1021 cgagtggcca atgaggagtc agagcacaat caagcctcca ttgtgttccc tcctccaggg
1081 gcttctgagg agaatggcct gcctcacacg tcagccagaa cccagctgcc ccagtcaatg
1141 aagattatgc atgagatcat gtacaaactg gaagtgctct atgtcctctg cgtgctgctg
1201 atggggcgtc agcgaaacca ggttcacaga atgattgcag agttcaagct gatccctgga
1261 cttaataatt tgtttgacaa actgatttgg aggaagcatt cagcatctgc ccttgtcctc
1321 catggtcaca accagaactg tgactgtagc ccggacatca ccttgaagat acagtttttg
1381 aggcttcttc agagcttcag tgaccaccac gagaacaagt acttgttact caacaaccag
1441 gagctgaatg aactcagtgc catctctctc aaggccaaca tccctgaggt ggaagctgtc 
1501 ctcaacaccg acaggagttt ggtgtgtgat gggaagaggg gcttattaac tcgtctgctg 
1561 caggtcatga agaaggagcc agcagagtcg tctttcaggt tttggcaagc tcgggctgtg 
1621 gagagtttcc tccgagggac cacctcctat gcagaccaga tgttcctgct gaagcgaggc 
1681 ctcttggagc acatccttta ctgcattgtg gacagcgagt gtaagtcaag ggatgtgctc 
1741 cagagttact ttgacctcct gggggagctg atgaagttca acgttgatgc attcaagaga 
1801 ttcaataaat atatcaacac cgatgcaaag ttccaggtat tcctgaagca gatcaacagc
1861 tccctggtgg actccaacat gctggtgcgc tgtgtcactc tgtccctgga ccgatttgaa
1921 aaccaggtgg atatgaaagt tgccgaggta ctgtctgaat gccgcctgct cgcctacata
1981 tcccaggtgc ccacgcagat gtccttcctc ttccgcctca tcaacatcat ccacgtgcag
2041 acgctgaccc aggagaacgt cagctgcctc aacaccagcc tggtgatcct gatgctggcc
2101 cgacggaaag agcggctgcc cctgtacctg cggctgctgc agcggatgga gcacagcaag
2161 aagtaccccg gcttcctgct caacaacttc cacaacctgc tgcgcttctg gcagcagcac
2221 tacctgcaca aggacaagga cagcacctgc ctagagaaca gctcctgcat cagcttctca
2281 tactggaagg agacagtgtc catcctgttg aacccggacc ggcagtcacc ctctgctctc
2341 gttagctaca ttgaggagcc ctacatggac atagacaggg acttcactga ggagtgacct
2401 tgggccaggc ctcgggaggc tgctgggcca gtgtgggtga gcgtgggtac gatgccacac
2461 gccctgccct gttcccgttc ctccctgctg ctctctgcct gccccaggtc tttgggtaca
2521 ggcttggtgg gagggaagtc ctagaagccc ttggtccccc tgggtctgag ggccctaggt
2581 catggagagc ctcagtcccc ataatgagga cagggtacca tgcccacctt tccttcagaa
2641 ccctggggcc cagggccacc cagaggtaag aggacattta gcattagctc tgtgtgagct
2701 cctgccggtt tcttggctgt cagtcagtcc cagagtgggg aggaagatat gggtgacccc
2761 caccccccat ctgtgagcca agcctccctt gtccctggcc tttggaccca ggcaaaggct
2821 tctgagccct gggcaggggt ggtgggtacc agagaatgct gccttccccc aagcctgccc
2881 ctctgcctca ttttcctgta gctcctctgg ttctgtttgc tcattggctg ctgtgttcat

2941 ccaagggggt tctcccagaa gtgaggggcc tttccctcca tcccttgagg cacggggcag 

3001 ctgtgcctgc cctgcctctg cctgaggcag ccgctcctgc ctgagcctgg acatggggcc 

3061 cttccttgtg ttgccaattt attaacagca aataaaccaa ttaaatggag actattaaat 
3121 aactttattt taaaaaaaaa aaaaaaaaa 



SEQ ID NO: 28 

C20orf188 BC013144

/translation="MAAAPVAAGSGAGRGRRSAATVAAWGGWGGRPRPGNILLQLRQG
QLTGRGLVRAVQFTETFLTERDKQSKWSGIPQLLLKLHTTSHLHSDFVECQNILKEIS
PLLSMEAMAFVTEERKLTQETTYPNTYIFDLFGGVDLLVEILMRPTISIRGQKLKISD
EMSKDCLSILYNTCVCTEGVTKRLAEKNDFVIFLFTLMTSKKTFLQTATLIEDILGVK
KEMIRLDEVPNLSSLVSNFDQQQLANFCRILAVTISEMDTGNDDKHTLLAKNAQQKKS
LSLGPSAAEINQAALLSIPGFVERLCKLATRKVSESTGTASFLQELEEWYTWLDNALV
LDALMRVANEESEHNQASIVFPPPGASEENGLPHTSARTQLPQSMKIMHEIMYKLEVL
YVLCVLLMGRQRNQVHRMIAEFKLIPGLNNLFDKLIWRKHSASALVLHGHNQNCDCSP
DITLKIQFLRLLQSFSDHHENKYLLLNNQELNELSAISLKANIPEVEAVLNTDRSLVC
DGKRGLLTRLLQVMKKEPAESSFRFWQARAVESFLRGTTSYADQMFLLKRGLLEHILY
CIVDSECKSRDVLQSYFDLLGELMKFNVDAFKRFNKYINTDAKFQVFLKQINSSLVDS
NMLVRCVTLSLDRFENQVDMKVAEVLSECRLLAYISQVPTQMSFLFRLINIIHVQTLT
QENVSCLNTSLVILMLARRKERLPLYLRLLQRMEHSKKYPGFLLNNFHNLLRFWQQHY
LHKDKDSTCLENSSCISFSYWKETVSILLNPDRQSPSALVSYIEEPYMDIDRDFTEE"